Abstract

We present an extensive analysis of the key problem of learning
optimal reserve prices for generalized second price auctions. We
describe two algorithms for this task: one based on density
estimation, and a novel algorithm benefiting from solid theoretical
guarantees and with a very favorable running-time complexity of
$O(nS \log(nS))$, where $n$ is the sample size and $S$ the number of
slots. Our theoretical guarantees are more favorable than those
previously presented in the literature. Additionally, we show that
even if bidders do not play at an equilibrium, our second algorithm is
still well defined and minimizes a quantity of interest. To our
knowledge, this is the first attempt to apply learning algorithms to
the problem of reserve price optimization in GSP auctions. Finally, we
present the first convergence analysis of empirical equilibrium
bidding functions to the unique symmetric Bayesian-Nash equilibrium of
a GSP.

1 INTRODUCTION 

The Generalized Second-Price (GSP) auction is currently 
the standard mechanism used for selling sponsored search 
advertisement. As suggested by the name, this mechanism 
generalizes the standard second-price auction of Vickrey 
(1961) to multiple items. In the case of sponsored search 
advertisement, these items correspond to ad slots which 
have been ranked by their position. Given this ranking, 
the GSP auction works as follows: first, each advertiser
places a bid; next, the seller, based on the bids placed, assigns
a score to each bidder. The highest scored advertiser
is assigned to the slot in the best position, that is, the one
with the highest likelihood of being clicked on. The second
highest score obtains the second best item and so on, until
all slots have been allocated or all advertisers have been
assigned to a slot. As with second-price auctions, the bidder's
payment is independent of his bid. Instead, it depends
solely on the bid of the advertiser assigned to the position
below.

In spite of its similarity with second-price auctions, the
GSP auction is not an incentive-compatible mechanism,
that is, bidders have an incentive to lie about their valuations.
This is in stark contrast with second-price auctions,
where truth revealing is in fact a dominant strategy. It is
for this reason that predicting the behavior of bidders in a
GSP auction is challenging. This is further worsened by the
fact that these auctions are repeated multiple times a day.
The study of all possible equilibria of this repeated game
is at the very least difficult. While incentive-compatible
generalizations of the second-price auction exist, namely
the Vickrey-Clarke-Groves (VCG) mechanism, the simplicity
of the payment rule for GSP auctions as well as the large
revenue generated by them has made the adoption of VCG
mechanisms unlikely.

Since their introduction by Google, GSP auctions have generated
billions of dollars across different online advertisement
companies. It is therefore not surprising that they have
become a topic of great interest for diverse fields such as
Economics, Algorithmic Game Theory and more recently
Machine Learning.

The first analysis of GSP auctions was carried out independently
by Edelman et al. (2005) and Varian (2007).
Both publications considered a full information scenario,
that is, one where the advertisers' valuations are publicly
known. This assumption is weakly supported by the fact
that repeated interactions allow advertisers to infer their
adversaries' valuations. Varian (2007) studied the so-called
Symmetric Nash Equilibria (SNE), which form a subset of the
Nash equilibria with several favorable properties. In particular,
Varian showed that any SNE induces an efficient allocation,
that is, an allocation where the highest positions are
assigned to advertisers with high values. Furthermore, the
revenue earned by the seller when advertisers play an SNE
is always at least as much as the one obtained by VCG.
The authors also presented some empirical results showing
that some bidders indeed play by using an SNE. However,
no theoretical justification can be given for the choice of
this subset of equilibria (Börgers et al., 2013; Edelman and
Schwarz, 2010). A finer analysis of the full information
scenario was given by Lucier et al. (2012). The authors
proved that, excluding the payment of the highest bidder,
the revenue achieved at any Nash equilibrium is at least
one half that of the VCG auction.


Since the assumption of full information can be unrealistic,
a more modern line of research has instead considered
a Bayesian scenario for this auction. In a Bayesian setting,
it is assumed that advertisers' valuations are i.i.d. samples
drawn from a common distribution. Gomes and Sweeney
(2014) characterized all symmetric Bayes-Nash equilibria
and showed that any symmetric equilibrium must be efficient.
This work was later extended by Sun et al. (2014)
to account for the quality score of each advertiser. The
main contribution of this work was the design of an algorithm
for the crucial problem of revenue optimization for
the GSP auction. Lahaie and Pennock (2007) studied different
squashing ranking rules for advertisers commonly
used in practice and showed that none of these rules are
necessarily optimal in equilibrium. This work is complemented
by the simulation analysis of Vorobeychik (2009),
who quantified the distance from equilibrium of bidding
truthfully. Lucier et al. (2012) showed that the GSP auction
with an optimal reserve price achieves at least 1/6 of
the optimal revenue (of any auction) in a Bayesian equilibrium.
More recently, Thompson and Leyton-Brown (2013)
compared different allocation rules and showed that an anchoring
allocation rule is optimal when valuations are sampled
i.i.d. from a uniform distribution. With the exception
of Sun et al. (2014), none of these authors have proposed
an algorithm for revenue optimization using historical data.

Zhu et al. (2009) introduced a ranking algorithm to learn an
optimal allocation rule. The proposed ranking is a convex
combination of a quality score based on the features of the
advertisement as well as a revenue score which depends on
the value of the bids. This work was later extended in (He
et al., 2014) where, in addition to the ranking function, a
behavioral model of the advertisers is learned by the authors.

The rest of this paper is organized as follows. In Section 2,
we give a learning formulation of the problem of
selecting reserve prices in a GSP auction. In Section 3, we
discuss previous work related to this problem. Next, we
present and analyze two learning algorithms for this problem
in Section 4, one based on density estimation extending
to this setting an algorithm of Guerre et al. (2000), and a
novel discriminative algorithm taking into account the loss
function and benefiting from favorable learning guarantees.
Section 5 provides a convergence analysis of the empirical
equilibrium bidding function to the true equilibrium bidding
function in a GSP. On its own, this result is of great
interest as it justifies the common assumption of buyers
playing a symmetric Bayes-Nash equilibrium. Finally, in
Section 6, we report the results of experiments comparing
our algorithms and demonstrating in particular the benefits
of the second algorithm.

2 MODEL


For the most part, we will use the model defined by Sun
et al. (2014) for GSP auctions with incomplete information.
We consider $N$ bidders competing for $S$ slots with $N > S$.
Let $v_i \in [0,1]$ and $b_i \in [0,1]$ denote the per-click valuation
of bidder $i$ and his bid respectively. Let the position factor
$c_s \in [0,1]$ represent the probability of a user noticing
an ad in position $s$ and let $e_i \in [0,1]$ denote the expected
click-through rate of advertiser $i$. That is, $e_i$ is the probability
of ad $i$ being clicked on given that it was noticed by the
user. We will adopt the common assumption that $c_s > c_{s+1}$
(Gomes and Sweeney, 2014; Lahaie and Pennock, 2007;
Sun et al., 2014; Thompson and Leyton-Brown, 2013). Define
the score of bidder $i$ to be $s_i = e_i v_i$. Following Sun
et al. (2014), we assume that $s_i$ is an i.i.d. realization of a
random variable with distribution $F$ and density function
$f$. Finally, we assume that advertisers bid in an efficient
symmetric Bayes-Nash equilibrium. This is motivated by
the fact that even though advertisers may not infer what the
valuation of their adversaries is from repeated interactions,
they can certainly estimate the distribution $F$.

Define $\pi \colon s \mapsto \pi(s)$ as the function mapping slots to
advertisers, i.e. $\pi(s) = i$ if advertiser $i$ is allocated to position $s$.
For a vector $\mathbf{x} = (x_1, \ldots, x_N) \in \mathbb{R}^N$, we use the notation
$x^{(s)} := x_{\pi(s)}$. Finally, denote by $r_i$ the reserve price for
advertiser $i$. An advertiser may participate in the auction
only if $b_i \ge r_i$. In this paper we present an analysis of the
two most common ranking rules (Qin et al., 2014):

1. Rank-by-bid. Advertisers who bid above their reserve
price are ranked in descending order of their
bids and the payment of advertiser $\pi(s)$ is equal to
$\max(r^{(s)}, b^{(s+1)})$.

2. Rank-by-revenue. Each advertiser is assigned a quality
score $q_i := q_i(b_i) = e_i b_i 1_{b_i \ge r_i}$ and the ranking
is done by sorting these scores in descending
order. The payment of advertiser $\pi(s)$ is given by
$\max\big(r^{(s)}, q^{(s+1)} / e^{(s)}\big)$.

In both setups, only advertisers bidding above their reserve
price are considered. Notice that rank-by-bid is a particular
case of rank-by-revenue where all quality scores are equal
to 1. Given a vector of reserve prices $\mathbf{r}$ and a bid vector $\mathbf{b}$,
we define the revenue function to be

$$\mathrm{Rev}(\mathbf{r}, \mathbf{b}) = \sum_{s=1}^{S} \frac{c_s}{e^{(s)}} \Big( q^{(s+1)} 1_{q^{(s+1)} \ge e^{(s)} r^{(s)}} + e^{(s)} r^{(s)} 1_{q^{(s+1)} < e^{(s)} r^{(s)} \le q^{(s)}} \Big).$$

Using the notation of Mohri and Medina (2014), we define
the loss function

$$L(\mathbf{r}, \mathbf{b}) = -\mathrm{Rev}(\mathbf{r}, \mathbf{b}).$$

Given an i.i.d. sample $S = (\mathbf{b}_1, \ldots, \mathbf{b}_n)$ of realizations of
an auction, our objective will be to find a reserve price vector
$\mathbf{r}^*$ that maximizes the expected revenue. Equivalently,
$\mathbf{r}^*$ should be a solution of the following optimization problem:

$$\min_{\mathbf{r} \in [0,1]^N} E_{\mathbf{b}}[L(\mathbf{r}, \mathbf{b})]. \tag{1}$$
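The payment rules above can be made concrete with a small sketch. It computes the revenue of a single auction as the position factor times the per-click payment, summed over occupied slots. The function name `gsp_revenue` and its argument names are illustrative assumptions rather than anything defined in the paper, and ties between equal scores are broken arbitrarily by the sort.

```python
def gsp_revenue(bids, reserves, ctr, position_factors):
    """Revenue of one GSP auction under the rank-by-revenue rule (a sketch).

    bids[i], reserves[i], ctr[i] play the roles of b_i, r_i, e_i and
    position_factors[s] the role of c_s. Setting every ctr[i] to 1
    recovers the rank-by-bid rule.
    """
    # Quality score q_i = e_i * b_i if the bid clears the reserve, else 0.
    scored = []
    for b, r, e in zip(bids, reserves, ctr):
        q = e * b if b >= r else 0.0
        scored.append((q, r, e))
    scored.sort(key=lambda t: t[0], reverse=True)

    revenue = 0.0
    for s, c in enumerate(position_factors):
        if s >= len(scored):
            break
        q, r, e = scored[s]
        if q <= 0.0:
            # Assumption of this sketch: slots past the last positive
            # score stay empty and earn nothing.
            break
        q_next = scored[s + 1][0] if s + 1 < len(scored) else 0.0
        # Per-click payment: max(r^(s), q^(s+1) / e^(s)).
        payment = max(r, q_next / e)
        revenue += c * payment
    return revenue


# Rank-by-bid example: the second slot is priced by the reserve,
# because the third bid does not clear it.
example = gsp_revenue([0.9, 0.6, 0.3], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0], [1.0, 0.5])
```

In the example, the top advertiser pays the second bid (0.6) and the second advertiser pays the reserve (0.5) weighted by the position factor 0.5.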

3 PREVIOUS WORK 

It has been shown, both theoretically and empirically, that
reserve prices can increase the revenue of an auction (Myerson,
1981; Ostrovsky and Schwarz, 2011). The choice
of an appropriate reserve price therefore becomes crucial.
If it is chosen too low, the seller might lose some revenue.
On the other hand, if it is set too high, then the advertisers
may not wish to bid above that value and the seller will not
obtain any revenue from the auction.

Mohri and Medina (2014), Pardoe et al. (2005), and Cesa-Bianchi
et al. (2013) have given learning algorithms that
estimate the optimal reserve price for a second-price auction
in different information scenarios. The scenario we
consider is most closely related to that of Mohri and Medina
(2014). An extension of this work to the GSP auction,
however, is not straightforward. Indeed, as we will show
later, the optimal reserve price vector depends on the distribution
of the advertisers' valuation. In a second-price auction,
these valuations are observed since the corresponding
mechanism is incentive-compatible. This does not hold
for GSP auctions. Moreover, for second-price auctions,
only one reserve price had to be estimated. In contrast, our
model requires the estimation of up to $N$ parameters with
intricate dependencies between them.

The problem of estimating valuations from observed bids in
a non-incentive compatible mechanism has been previously
analyzed. Guerre et al. (2000) described a way of estimating
valuations from observed bids in a first-price auction.
We will show that this method can be extended to the GSP
auction. The rate of convergence of this algorithm, however,
in general will be worse than the standard learning
rate of $O\big(\frac{1}{\sqrt{n}}\big)$.

Sun et al. (2014) showed that, for advertisers playing an
efficient equilibrium, the optimal reserve price is given by
$r_i = \frac{r}{e_i}$ where $r$ satisfies

$$r = \frac{1 - F(r)}{f(r)}.$$

The authors suggest learning $r$ via a maximum likelihood
technique over some parametric family to estimate $f$ and
$F$, and to use these estimates in the above expression.
There are two main drawbacks for this algorithm. The first
is a standard problem of parametric statistics: there are no
guarantees on the convergence of their estimation procedure
when the density function $f$ is not part of the parametric
family considered. While this problem can be addressed
by the use of a non-parametric estimation algorithm such as
kernel density estimation, the fact remains that the function
$f$ is the density for the unobservable scores $s_i$ and therefore
cannot be properly estimated. The solution proposed
by the authors assumes that the bids in fact form a perfect
SNE and so advertisers' valuations can be recovered using
the process described by Varian (2007). There is however
no justification for this assumption and, in fact, we show in
Section 6 that bids played in a Bayes-Nash equilibrium do
not in general form a SNE.

4 LEARNING ALGORITHMS

Here, we present and analyze two algorithms for learning
the optimal reserve price for a GSP auction when advertisers
play a symmetric equilibrium.

4.1 DENSITY ESTIMATION ALGORITHM 


First, we derive an extension of the algorithm of Guerre
et al. (2000) to GSP auctions. To do so, we first derive a
formula for the bidding strategy at equilibrium. Let $z_s(v)$
denote the probability of winning position $s$ given that the
advertiser's valuation is $v$. It is not hard to verify that

$$z_s(v) = \binom{N-1}{s-1} (1 - F(v))^{s-1} F^{p}(v),$$

where $p = N - s$. Indeed, in an efficient equilibrium, the
bidder with the $s$-th highest valuation must be assigned to
the $s$-th highest position. Therefore an advertiser with valuation
$v$ is assigned to position $s$ if and only if $s - 1$ bidders
have a higher valuation and $p$ have a lower valuation.

For a rank-by-bid auction, Gomes and Sweeney (2014)
showed the following results.

Theorem 1 (Gomes and Sweeney (2014)). A GSP auction
has a unique efficient symmetric Bayes-Nash equilibrium
with bidding strategy $\beta$ if and only if $\beta$ is strictly increasing
and satisfies the following integral equation:

$$\sum_{s=1}^{S} c_s \int_0^v \frac{dz_s(t)}{dt}\, t\, dt = \sum_{s=1}^{S} c_s \binom{N-1}{s-1} (1 - F(v))^{s-1} \int_0^v \beta(t)\, p F^{p-1}(t) f(t)\, dt. \tag{2}$$

Furthermore, the optimal reserve price $r^*$ satisfies

$$r^* = \frac{1 - F(r^*)}{f(r^*)}. \tag{3}$$

The authors show that, if the click probabilities $c_s$ are sufficiently
diverse, then $\beta$ is guaranteed to be strictly increasing.
When ranking is done by revenue, Sun et al. (2014)
gave the following theorem.

Theorem 2 (Sun et al. (2014)). Let $\beta$ be defined by the previous
theorem. If advertisers bid in a Bayes-Nash equilibrium
then $b_i = \frac{\beta(v_i)}{e_i}$. Moreover, the optimal reserve price
vector $\mathbf{r}^*$ is given by $r_i^* = \frac{r}{e_i}$ where $r$ satisfies equation (3).

We are now able to present the foundation of our first algorithm.
Instead of assuming that the bids constitute an SNE
as in (Sun et al., 2014), we follow the ideas of Guerre et al.
(2000) and infer the scores $s_i$ only from observables $b_i$.
Our result is presented for the rank-by-bid GSP auction but
an extension to the rank-by-revenue mechanism is trivial.
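As an illustration of equation (3), the following sketch solves $r = (1 - F(r))/f(r)$ by bisection when $F$ and $f$ are known. The helper name `solve_reserve` and the assumption that $h(r) = r - (1 - F(r))/f(r)$ changes sign exactly once on the search interval are ours; the latter holds, for instance, for the uniform distribution on $[0,1]$, where the solution is $r = 1/2$.

```python
def solve_reserve(cdf, pdf, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on h(r) = r - (1 - F(r)) / f(r) over [lo, hi].

    Assumes h(lo) <= 0 <= h(hi) and a single sign change in between.
    """
    def h(r):
        return r - (1.0 - cdf(r)) / pdf(r)

    if h(lo) > 0.0 or h(hi) < 0.0:
        raise ValueError("h does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Uniform valuations on [0, 1]: F(r) = r and f(r) = 1.
r_uniform = solve_reserve(lambda r: r, lambda r: 1.0)

# F(r) = r^2 on [0, 1]; the lower end is nudged away from 0 where f vanishes.
r_square = solve_reserve(lambda r: r * r, lambda r: 2.0 * r, lo=1e-6)
```

Under rank-by-revenue, the per-advertiser reserve prices of Theorem 2 would then be obtained by dividing the returned value by each $e_i$.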

Lemma 1. Let $v_1, \ldots, v_n$ be an i.i.d. sample of valuations
from distribution $F$ and let $b_i = \beta(v_i)$ be the bid
played at equilibrium. Then the random variables $b_i$ are
i.i.d. with distribution $G(b) = F(\beta^{-1}(b))$ and density
$g(b) = \frac{f(\beta^{-1}(b))}{\beta'(\beta^{-1}(b))}$. Furthermore,

$$v_i = \beta^{-1}(b_i) = \frac{\sum_{s=1}^{S} c_s \binom{N-1}{s-1} (1 - G(b_i))^{s-1} p\, G(b_i)^{p-1} g(b_i)\, b_i}{\sum_{s=1}^{S} c_s \frac{d\bar{z}_s}{db}(b_i)} - \frac{\sum_{s=1}^{S} c_s \binom{N-1}{s-1} (s-1)(1 - G(b_i))^{s-2} g(b_i) \int_0^{b_i} p\, G(u)^{p-1} u\, g(u)\, du}{\sum_{s=1}^{S} c_s \frac{d\bar{z}_s}{db}(b_i)}, \tag{4}$$

where $\bar{z}_s(b) := z_s(\beta^{-1}(b))$ and is given by $\binom{N-1}{s-1} (1 - G(b))^{s-1} G(b)^{p}$.

Proof. By definition, $b_i = \beta(v_i)$ is a function of only $v_i$.
Since $\beta$ does not depend on the other samples either, it
follows that $(b_i)_{i=1}^{n}$ must be an i.i.d. sample. Using the
fact that $\beta$ is a strictly increasing function we also have
$G(b) = P(b_i \le b) = P(v_i \le \beta^{-1}(b)) = F(\beta^{-1}(b))$ and a
simple application of the chain rule gives us the expression
for the density $g(b)$. To prove the second statement observe
that by the change of variable $v = \beta^{-1}(b)$, the right-hand
side of (2) is equal to

$$\sum_{s=1}^{S} c_s \binom{N-1}{s-1} (1 - G(b))^{s-1} \int_0^{\beta^{-1}(b)} p\, \beta(t) F^{p-1}(t) f(t)\, dt = \sum_{s=1}^{S} c_s \binom{N-1}{s-1} (1 - G(b))^{s-1} \int_0^{b} p\, u\, G(u)^{p-1} g(u)\, du.$$

The last equality follows by the change of variable $t = \beta^{-1}(u)$
and from the fact that $g(b) = \frac{f(\beta^{-1}(b))}{\beta'(\beta^{-1}(b))}$. The same
change of variables applied to the left-hand side of (2)
yields the following integral equation:

$$\sum_{s=1}^{S} c_s \int_0^{b} \beta^{-1}(u) \frac{d\bar{z}_s}{du}(u)\, du = \sum_{s=1}^{S} c_s \binom{N-1}{s-1} (1 - G(b))^{s-1} \int_0^{b} u\, p\, G(u)^{p-1} g(u)\, du.$$

Taking the derivative with respect to $b$ of both sides of this
equation and rearranging terms lead to the desired expression. □

The previous lemma shows that we can recover the valuation
of an advertiser from its bid. We therefore propose the
following algorithm for estimating the value of $r$.

1. Use the sample $S$ to estimate $G$ and $g$.

2. Plug these estimates in (4) to obtain approximate samples
from the distribution $F$.

3. Use the approximate samples to find estimates $\hat{f}$ and
$\hat{F}$ of the valuations density and cumulative distribution
functions respectively.

4. Use $\hat{F}$ and $\hat{f}$ to estimate $r$.

In order to avoid the use of parametric methods, a kernel
density estimation algorithm can be used to estimate
$g$ and $f$. While this algorithm addresses both drawbacks
of the algorithm proposed by Sun et al. (2014), it can be
shown (Guerre et al., 2000, Theorem 2) that if $f$ is $R$ times
continuously differentiable, then, after seeing $n$ samples,
$\|\hat{f} - f\|_\infty$ is in $\Omega\big(\frac{1}{n^{R/(2R+3)}}\big)$ independently of the algorithm
used to estimate $f$. In particular, note that for $R = 1$
the rate is in $\Omega\big(\frac{1}{n^{1/5}}\big)$. This unfavorable rate of convergence
can be attributed to the fact that a two-step estimation
algorithm is being used (estimation of $g$ and $f$). But, even
with access to bidder valuations, the rate can only be improved
to $\Omega\big(\frac{1}{n^{R/(2R+1)}}\big)$ (Guerre et al., 2000). Furthermore,
a small error in the estimation of $f$ affects the denominator
of the equation defining $r$ and can result in a large error on
the estimate of $r$.
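The first two steps of the density estimation procedure can be sketched as follows for the rank-by-bid auction. This is a simplified variant under stated assumptions, not the estimator analyzed above: every term of (4) carries a factor $g(b_i)$, so the sketch evaluates the ratio from the empirical distribution $\hat{G}$ of the pooled bids alone, and it replaces $\int_0^{b} p\, G(u)^{p-1} u\, g(u)\, du$ by an empirical average over the observed bids. The name `recover_valuations` and its arguments are hypothetical.

```python
from bisect import bisect_right
from math import comb


def recover_valuations(bids, n_bidders, position_factors):
    """Approximate v_i = beta^{-1}(b_i) from pooled equilibrium bids via (4)."""
    n = len(bids)
    order = sorted(bids)
    n_slots = len(position_factors)

    def ecdf(b):
        # Empirical distribution function G_hat of the pooled bids.
        return bisect_right(order, b) / n

    # prefix[p][k]: empirical version of the integral of p G^{p-1} u g du,
    # accumulated over the k smallest bids.
    prefix = {}
    for s in range(1, n_slots + 1):
        p = n_bidders - s
        acc, running = [0.0], 0.0
        for u in order:
            running += p * ecdf(u) ** (p - 1) * u / n
            acc.append(running)
        prefix[p] = acc

    values = []
    for b in bids:
        g_b = ecdf(b)
        k = bisect_right(order, b)
        num = den = 0.0
        for s, c in enumerate(position_factors, start=1):
            p = n_bidders - s
            w = c * comb(n_bidders - 1, s - 1)
            # Both pieces below omit the common factor g(b), which cancels.
            rise = (1.0 - g_b) ** (s - 1) * p * g_b ** (p - 1)
            fall = (s - 1) * (1.0 - g_b) ** (s - 2) if s > 1 else 0.0
            num += w * (rise * b - fall * prefix[p][k])
            den += w * (rise - fall * g_b ** p)
        values.append(num / den)
    return values


# With a single slot the GSP reduces to a second-price auction,
# so the recovered valuations coincide with the bids.
single_slot = recover_valuations([0.1, 0.4, 0.7, 0.9], 3, [1.0])
```

The sketch does not guard against a vanishing denominator, which can occur when the position factors are not sufficiently diverse; steps 3 and 4 would then apply a density estimator to the returned values.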


4.2 DISCRIMINATIVE ALGORITHM 


In view of the problems associated with density estimation,
we propose to use empirical risk minimization to find
an approximation to the optimal reserve price. In particular,
we are interested in solving the following optimization
problem:

$$\min_{\mathbf{r} \in [0,1]^N} \sum_{i=1}^{n} L(\mathbf{r}, \mathbf{b}_i). \tag{5}$$


We first show that, when bidders play in equilibrium, the 
optimization problem (1) can be considerably simplified. 

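Proposition 1. If advertisers play a symmetric Bayes-Nash
equilibrium then

$$\min_{\mathbf{r} \in [0,1]^N} E_{\mathbf{b}}[L(\mathbf{r}, \mathbf{b})] = \min_{r \in [0,1]} E_{\mathbf{b}}[\widetilde{L}(r, \mathbf{b})],$$

where $\widetilde{q}_i := \widetilde{q}_i(b_i) = e_i b_i$ and

$$\widetilde{L}(r, \mathbf{b}) = -\sum_{s=1}^{S} \frac{c_s}{e^{(s)}} \Big( \widetilde{q}^{(s+1)} 1_{\widetilde{q}^{(s+1)} \ge r} + r\, 1_{\widetilde{q}^{(s+1)} < r \le \widetilde{q}^{(s)}} \Big).$$

Figure 1: Plot of the loss function $L_{i,s}$. Notice that the loss
in fact resembles a broken "V".

Proof. Since advertisers play a symmetric Bayes-Nash
equilibrium, the optimal reserve price vector $\mathbf{r}^*$ is of
the form $r_i^* = \frac{r}{e_i}$. Therefore, letting $D = \{\mathbf{r} \mid r_i = \frac{r}{e_i},
r \in [0,1]\}$ we have $\min_{\mathbf{r} \in [0,1]^N} E_{\mathbf{b}}[L(\mathbf{r}, \mathbf{b})] =
\min_{\mathbf{r} \in D} E_{\mathbf{b}}[L(\mathbf{r}, \mathbf{b})]$. Furthermore, when restricted to $D$,
the objective function $L$ is given by

$$-\sum_{s=1}^{S} \frac{c_s}{e^{(s)}} \Big( q^{(s+1)} 1_{q^{(s+1)} \ge r} + r\, 1_{q^{(s+1)} < r \le q^{(s)}} \Big).$$

Thus, we are left with showing that replacing $q^{(s)}$ with $\widetilde{q}^{(s)}$
in this expression does not affect its value. Let $r \ge 0$. Since
$q_i = \widetilde{q}_i 1_{\widetilde{q}_i \ge r}$, in general the equality $q^{(s)} = \widetilde{q}^{(s)}$ does not
hold. Nevertheless, if $s_0$ denotes the largest index less than
or equal to $S$ satisfying $q^{(s_0)} > 0$, then $\widetilde{q}^{(s)} \ge r$ for all
$s \le s_0$ and $q^{(s)} = \widetilde{q}^{(s)}$. On the other hand, for $S \ge s > s_0$,
$1_{q^{(s)} \ge r} = 1_{\widetilde{q}^{(s)} \ge r} = 0$. Thus,

$$\sum_{s=1}^{S} \frac{c_s}{e^{(s)}} \Big( q^{(s+1)} 1_{q^{(s+1)} \ge r} + r\, 1_{q^{(s+1)} < r \le q^{(s)}} \Big) = \sum_{s=1}^{s_0} \frac{c_s}{e^{(s)}} \Big( q^{(s+1)} 1_{q^{(s+1)} \ge r} + r\, 1_{q^{(s+1)} < r \le q^{(s)}} \Big)$$
$$= \sum_{s=1}^{s_0} \frac{c_s}{e^{(s)}} \Big( \widetilde{q}^{(s+1)} 1_{\widetilde{q}^{(s+1)} \ge r} + r\, 1_{\widetilde{q}^{(s+1)} < r \le \widetilde{q}^{(s)}} \Big) = \sum_{s=1}^{S} \frac{c_s}{e^{(s)}} \Big( \widetilde{q}^{(s+1)} 1_{\widetilde{q}^{(s+1)} \ge r} + r\, 1_{\widetilde{q}^{(s+1)} < r \le \widetilde{q}^{(s)}} \Big) = -\widetilde{L}(r, \mathbf{b}),$$

which completes the proof. □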


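In view of this proposition, we can replace the challenging
problem of solving an optimization problem in $\mathbb{R}^N$ with
solving the following simpler empirical risk minimization
problem

$$\min_{r \in [0,1]} \sum_{i=1}^{n} \widetilde{L}(r, \mathbf{b}_i) = \min_{r \in [0,1]} \sum_{i=1}^{n} \sum_{s=1}^{S} L_{s,i}\big(r, \widetilde{q}_i^{(s)}, \widetilde{q}_i^{(s+1)}\big), \tag{6}$$

where $L_{s,i}\big(r, \widetilde{q}^{(s)}, \widetilde{q}^{(s+1)}\big) := -\frac{c_s}{e_i^{(s)}} \Big( \widetilde{q}_i^{(s+1)} 1_{\widetilde{q}_i^{(s+1)} \ge r} + r\, 1_{\widetilde{q}_i^{(s+1)} < r \le \widetilde{q}_i^{(s)}} \Big)$.
In order to efficiently minimize this
highly non-convex function, we draw upon the work of
Mohri and Medina (2014) on minimization of sums of
v-functions.

Algorithm 1 Minimization algorithm

Require: Scores $(\widetilde{q}_i^{(s)})$, $1 \le i \le n$, $1 \le s \le S$.
    Define $(p_{is}^{(1)}, p_{is}^{(2)}) = (\widetilde{q}_i^{(s)}, \widetilde{q}_i^{(s+1)})$; $m = nS$;
    $\mathcal{N} := \bigcup_{i=1}^{n} \bigcup_{s=1}^{S} \{p_{is}^{(1)}, p_{is}^{(2)}\}$;
    $(n_1, \ldots, n_{2m}) = \mathrm{Sort}(\mathcal{N})$;
    Set $\mathbf{d} := (d_1, d_2) = 0$
    Set $d_1 = -\sum_{i=1}^{n} \sum_{s=1}^{S} \frac{c_s}{e_i^{(s)}} p_{is}^{(2)}$;
    Set $r^* = -1$ and $L^* = \infty$
    for $j = 2, \ldots, 2m$ do
        if $n_{j-1} = p_{is}^{(2)}$ then
            $d_1 = d_1 + \frac{c_s}{e_i^{(s)}} p_{is}^{(2)}$; $d_2 = d_2 - \frac{c_s}{e_i^{(s)}}$;
        else if $n_{j-1} = p_{is}^{(1)}$ then
            $d_2 = d_2 + \frac{c_s}{e_i^{(s)}}$
        end if
        $L = d_1 + n_j d_2$;
        if $L < L^*$ then
            $L^* = L$; $r^* = n_j$;
        end if
    end for
    return $r^*$;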


Definition 1. A function $V \colon \mathbb{R}^3 \to \mathbb{R}$ is a v-function if it
admits the following form:

$$V(r, q_1, q_2) = -a^{(1)} 1_{r \le q_2} - a^{(2)} r\, 1_{q_2 < r \le q_1} + \Big[ \frac{r}{\eta} - a^{(3)} \Big] 1_{q_1 < r < (1+\eta) q_1},$$

with $0 \le a^{(1)}, a^{(2)}, a^{(3)}, \eta \le \infty$ constants satisfying
$a^{(1)} = a^{(2)} q_2$, $-a^{(2)} q_1 1_{\eta > 0} = \big( \frac{1}{\eta} q_1 - a^{(3)} \big) 1_{\eta > 0}$, under
the convention that $0 \cdot \infty = 0$.

As suggested by their name, these functions admit a characteristic
"V shape". It is clear from Figure 1 that $L_{s,i}$ is a
v-function with $a^{(2)} = \frac{c_s}{e_i^{(s)}}$, $a^{(1)} = a^{(2)} \widetilde{q}_i^{(s+1)}$ and $\eta = 0$.
Thus, we can apply the optimization algorithm given by
Mohri and Medina (2014) to minimize (6) in $O(nS \log nS)$
time. Algorithm 1 gives the pseudocode of the adaptation
of this general algorithm to our problem. A proof
of the correctness of this algorithm can be found in (Mohri
and Medina, 2014).
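The sorted sweep behind this minimization can be sketched in a few lines. The code below is an illustrative variant written for this purpose, not the pseudocode of Mohri and Medina (2014): each term of the objective is passed as a hypothetical triple `(hi, lo, w)` standing for $(\widetilde{q}_i^{(s)}, \widetilde{q}_i^{(s+1)}, c_s / e_i^{(s)})$, the running value `d1` collects the flat parts of the terms, and `d2` is kept non-negative as the total weight of the terms currently on their decreasing branch. The function `v_loss` evaluates the same objective directly and is only used as a reference.

```python
from math import inf


def v_loss(r, pieces):
    """Direct evaluation of the sum of v-functions (eta = 0) at r."""
    total = 0.0
    for hi, lo, w in pieces:
        if r <= lo:
            total -= w * lo      # flat branch
        elif r <= hi:
            total -= w * r       # decreasing branch
    return total


def minimize_reserve(pieces):
    """Return (r, loss) minimizing the sum over all breakpoints in O(m log m)."""
    events = []
    d1 = 0.0
    for hi, lo, w in pieces:
        d1 -= w * lo
        events.append((lo, w * lo, w))    # past lo: flat part leaves, slope enters
        events.append((hi, 0.0, -w))      # past hi: the term drops to zero
    events.sort(key=lambda e: e[0])

    d2 = 0.0
    best_r, best_loss = None, inf
    k = 0
    for value, _, _ in events:
        # Apply every event located strictly below the candidate reserve.
        while events[k][0] < value:
            d1 += events[k][1]
            d2 += events[k][2]
            k += 1
        loss = d1 - value * d2
        if loss < best_loss:
            best_r, best_loss = value, loss
    return best_r, best_loss


# One auction, one slot: the loss is -0.3 up to r = 0.3 and -r up to r = 0.8.
best = minimize_reserve([(0.8, 0.3, 1.0)])
```

Since the objective is non-increasing between consecutive breakpoints, it suffices to evaluate it at the breakpoints themselves, which is what the sweep does after a single sort.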

We conclude this section by presenting learning guarantees 
for our algorithm. Our bounds are given in terms of the 
Rademacher complexity and the VC-dimension. 

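Definition 2. Let $\mathcal{X}$ be a set and let $G := \{g \colon \mathcal{X} \to \mathbb{R}\}$ be
a family of functions. Given a sample $S = (x_1, \ldots, x_n) \in \mathcal{X}^n$,
the empirical Rademacher complexity of $G$ is defined by

$$\widehat{\mathfrak{R}}_S(G) = \frac{1}{n} E_{\sigma} \Big[ \sup_{g \in G} \sum_{i=1}^{n} \sigma_i g(x_i) \Big],$$

where the $\sigma_i$s are independent random variables distributed
uniformly over the set $\{-1, 1\}$.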
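For intuition, the empirical Rademacher complexity of a finite family can be approximated by Monte Carlo sampling of the sign vector. The sketch below assumes the family is given as a table `values[k][i]` holding the value of the $k$-th function on the $i$-th sample point; the function name, the table layout and the number of draws are illustrative choices, and an infinite class such as the one indexed by $r \in [0,1]$ would first have to be restricted to a finite grid.

```python
import random


def empirical_rademacher(values, n_draws=2000, seed=0):
    """Monte Carlo estimate of (1/n) E_sigma[sup_g sum_i sigma_i g(x_i)]."""
    rng = random.Random(seed)
    n = len(values[0])
    total = 0.0
    for _ in range(n_draws):
        # Independent signs, uniform over {-1, +1}.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        best = max(sum(s * v for s, v in zip(sigma, row)) for row in values)
        total += best / n
    return total / n_draws


# The two constant functions +1 and -1 on a single point: the supremum
# always equals 1, whatever the sign drawn.
constant_pair = empirical_rademacher([[1.0], [-1.0]])
```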

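Proposition 2. Let $m = \min_i e_i > 0$ and $M = \sum_{s=1}^{S} c_s$.
Then, for any $\delta > 0$, with probability at least $1 - \delta$ over
the draw of a sample $S$ of size $n$, each of the following
inequalities holds for all $r \in [0,1]$:

$$E[\widetilde{L}(r, \mathbf{b})] \le \frac{1}{n} \sum_{i=1}^{n} \widetilde{L}(r, \mathbf{b}_i) + C(M, m, n, \delta) \tag{7}$$

$$\frac{1}{n} \sum_{i=1}^{n} \widetilde{L}(r, \mathbf{b}_i) \le E[\widetilde{L}(r, \mathbf{b})] + C(M, m, n, \delta), \tag{8}$$

where $C(M, m, n, \delta) = [\ldots]$.

Proof. Let $\Psi \colon S \mapsto \sup_{r \in [0,1]} \frac{1}{n} \sum_{i=1}^{n} \widetilde{L}(r, \mathbf{b}_i) -
E[\widetilde{L}(r, \mathbf{b})]$. Let $S^i$ be a sample obtained from $S$ by replacing
$\mathbf{b}_i$ with $\mathbf{b}_i'$. It is not hard to verify that
$|\Psi(S) - \Psi(S^i)| \le \frac{M}{nm}$. Thus, it follows from a standard learning
bound that, with probability at least $1 - \delta$,

$$E[\widetilde{L}(r, \mathbf{b})] \le \frac{1}{n} \sum_{i=1}^{n} \widetilde{L}(r, \mathbf{b}_i) + \widehat{\mathfrak{R}}_S(\widetilde{\mathcal{R}}) + [\ldots],$$

where $\widetilde{\mathcal{R}} = \{L_r \colon \mathbf{b} \mapsto \widetilde{L}(r, \mathbf{b}) \mid r \in [0,1]\}$. We proceed
to bound the empirical Rademacher complexity of the
class $\widetilde{\mathcal{R}}$. For $q_1 > q_2 \ge 0$ let $L(r, q_1, q_2) = q_2 1_{q_2 > r} +
r\, 1_{q_1 \ge r \ge q_2}$. By definition of the Rademacher complexity
we can write

$$\widehat{\mathfrak{R}}_S(\widetilde{\mathcal{R}}) = \frac{1}{n} E_{\sigma} \Big[ \sup_{r \in [0,1]} \sum_{i=1}^{n} \sigma_i L_r(\mathbf{b}_i) \Big]
= \frac{1}{n} E_{\sigma} \Big[ \sup_{r \in [0,1]} \sum_{i=1}^{n} \sigma_i \sum_{s=1}^{S} \psi_s\big(L(r, \widetilde{q}_i^{(s)}, \widetilde{q}_i^{(s+1)})\big) \Big]$$
$$\le \frac{1}{n} E_{\sigma} \Big[ \sum_{s=1}^{S} \sup_{r \in [0,1]} \sum_{i=1}^{n} \sigma_i \psi_s\big(L(r, \widetilde{q}_i^{(s)}, \widetilde{q}_i^{(s+1)})\big) \Big],$$

where $\psi_s$ is the $\frac{c_s}{m}$-Lipschitz function mapping $x \mapsto \frac{c_s}{e^{(s)}} x$.
Therefore, by Talagrand's contraction lemma (Ledoux and
Talagrand, 2011), the last term is bounded by

$$\sum_{s=1}^{S} \frac{c_s}{nm} E_{\sigma} \Big[ \sup_{r \in [0,1]} \sum_{i=1}^{n} \sigma_i L(r, \widetilde{q}_i^{(s)}, \widetilde{q}_i^{(s+1)}) \Big] = \sum_{s=1}^{S} \frac{c_s}{m} \widehat{\mathfrak{R}}_{S_s}(\mathcal{R}),$$

where $S_s = \big( (\widetilde{q}_1^{(s)}, \widetilde{q}_1^{(s+1)}), \ldots, (\widetilde{q}_n^{(s)}, \widetilde{q}_n^{(s+1)}) \big)$ and $\mathcal{R} :=
\{L(r, \cdot, \cdot) \mid r \in [0,1]\}$. The loss $L(r, \widetilde{q}^{(s)}, \widetilde{q}^{(s+1)})$ in fact
evaluates to the negative revenue of a second-price auction
with highest bid $\widetilde{q}^{(s)}$ and second highest bid $\widetilde{q}^{(s+1)}$ (Mohri
and Medina, 2014). Therefore, by Propositions 9 and 10 of
Mohri and Medina (2014) we can write

$$\widehat{\mathfrak{R}}_{S_s}(\mathcal{R}) \le \frac{1}{n} E_{\sigma} \Big[ \sup_{r \in [0,1]} \sum_{i=1}^{n} r \sigma_i \Big] + \sqrt{\frac{2 \log en}{n}},$$

which concludes the proof. □

Corollary 1. Under the hypotheses of Proposition 2, let $\hat{r}$
denote the empirical minimizer and $r^*$ the minimizer of the
expected loss. Then, for any $\delta > 0$, with probability at least
$1 - \delta$, the following inequality holds:

$$E[\widetilde{L}(\hat{r}, \mathbf{b})] - E[\widetilde{L}(r^*, \mathbf{b})] \le 2 C\big(M, m, n, \tfrac{\delta}{2}\big).$$

Proof. By the union bound, (7) and (8) hold simultaneously
with probability at least $1 - \delta$ if $\delta$ is replaced by $\delta/2$
in those expressions. Adding both inequalities and using the
fact that $\hat{r}$ is an empirical minimizer yields the result. □

It is worth noting that our algorithm is well defined whether
or not the buyers bid in equilibrium. Indeed, the algorithm
consists of the minimization over $r$ of an observable quantity.
While we can guarantee convergence to a solution of
(1) only when buyers play a symmetric BNE, our algorithm
will still find an approximate solution to

$$\min_{r \in [0,1]} E_{\mathbf{b}}[\widetilde{L}(r, \mathbf{b})],$$

which remains a quantity of interest that can be close to (1)
if buyers are close to the equilibrium.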

5 CONVERGENCE OF EMPIRICAL EQUILIBRIA

A crucial assumption in the study of GSP auctions, including
this work, is that advertisers bid in a Bayes-Nash equilibrium
(Lucier et al., 2012; Sun et al., 2014). This assumption
is partially justified by the fact that advertisers can infer
the underlying distribution $F$ using as observations the
outcomes of the past repeated auctions and can thereby implement
an efficient equilibrium.

In this section, we provide a stronger theoretical justification
in support of this assumption: we quantify the difference
between the bidding function calculated using observed
empirical distributions and the true symmetric bidding
function in equilibria. For the sake of notational simplicity,
we will consider only the rank-by-bid GSP auction.

Let $S_v = (v_1, \ldots, v_n)$ be an i.i.d. sample of values drawn
from a continuous distribution $F$ with density function $f$.
Assume without loss of generality that $v_1 \le \ldots \le v_n$ and
let $\mathbf{v}$ denote the vector defined by $\mathbf{v}_i = v_i$. Let $\widehat{F}$ denote
the empirical distribution function induced by $S_v$ and let
$\mathbf{F} \in \mathbb{R}^n$ and $\mathbf{G} \in \mathbb{R}^n$ be defined by $\mathbf{F}_i = \widehat{F}(v_i) = i/n$ and
$\mathbf{G}_i = 1 - \mathbf{F}_i$.


We consider a discrete GSP auction where the advertisers'
valuations are i.i.d. samples drawn from the distribution $\widehat{F}$.
In the event where two or more advertisers admit the same
valuation, ties are broken randomly. Denote by $\widehat{\beta}$ the bidding
function for this auction in equilibrium (when it exists).
We are interested in characterizing $\widehat{\beta}$ and in providing
guarantees on the convergence of $\widehat{\beta}$ to $\beta$ as the sample size
increases.

We first introduce the notation used throughout this section.

Definition 3. Given a vector $\mathbf{F} \in \mathbb{R}^n$, the backwards difference
operator $\Delta \colon \mathbb{R}^n \to \mathbb{R}^n$ is defined as

$$\Delta \mathbf{F}_i = \mathbf{F}_i - \mathbf{F}_{i-1},$$

for $i > 1$ and $\Delta \mathbf{F}_1 = \mathbf{F}_1$.

We will denote $\Delta \Delta \mathbf{F}_i$ by $\Delta^2 \mathbf{F}_i$. Given any $k \in \mathbb{N}$ and
a vector $\mathbf{F}$, the vector $\mathbf{F}^k$ is defined as $\mathbf{F}_i^k = (\mathbf{F}_i)^k$. Let
us now define the discrete analog of the function $z_s$ that
quantifies the probability of winning slot $s$.

Proposition 3. In a symmetric efficient equilibrium of the
discrete GSP, the probability $\widehat{z}_s(v)$ that an advertiser with
valuation $v$ is assigned to slot $s$ is given by

$$\widehat{z}_s(v) = \sum_{j=0}^{N-s} \sum_{k=0}^{s-1} \binom{N-1}{j, k, N-1-j-k} \frac{\mathbf{F}_{i-1}^{j} \mathbf{G}_i^{k}}{(N - j - k)\, n^{N-1-j-k}},$$

if $v = v_i$ and otherwise by

$$\widehat{z}_s(v) = \binom{N-1}{s-1} \lim_{v' \to v^-} \widehat{F}(v')^{p} (1 - \widehat{F}(v'))^{s-1} =: \widehat{z}_s^{-}(v),$$

where $p = N - s$.

In particular, notice that $\widehat{z}_s^{-}(v_i)$ admits the simple expression

$$\widehat{z}_s^{-}(v_i) = \binom{N-1}{s-1} \mathbf{F}_{i-1}^{p} \mathbf{G}_{i-1}^{s-1},$$

which is the discrete version of the function $z_s$. On the
other hand, even though $\widehat{z}_s(v_i)$ does not admit a closed
form, it is not hard to show that

$$\widehat{z}_s(v_i) = \binom{N-1}{s-1} \mathbf{F}_{i-1}^{p} \mathbf{G}_i^{s-1} + O\Big(\frac{1}{n}\Big), \tag{9}$$

which again can be thought of as a discrete version of $z_s$.
The proof of this and all other propositions in this section
are deferred to the Appendix. Let us now define the lower
triangular matrix $\mathbf{M}(s)$ by:

$$\mathbf{M}_{ij}(s) = -\binom{N-1}{s-1} \frac{n\, \Delta \mathbf{F}_j^{p}\, \Delta \mathbf{G}_i^{s}}{s}$$

for $i > j$ and

$$\mathbf{M}_{ii}(s) = \sum_{j=0}^{N-s-1} \sum_{k=0}^{s-1} \binom{N-1}{j, k, N-1-j-k} [\ldots].$$

Proposition 4. If the discrete GSP auction admits a symmetric
efficient equilibrium, then its bidding function $\widehat{\beta}$ satisfies
$\widehat{\beta}(v_i) = \boldsymbol{\beta}_i$, where $\boldsymbol{\beta}$ is the solution of the following
linear equation:

$$\mathbf{M} \boldsymbol{\beta} = \mathbf{u}, \tag{10}$$

with $\mathbf{M} = \sum_{s=1}^{S} c_s \mathbf{M}(s)$ and $\mathbf{u}_i = \sum_{s=1}^{S} \big( [\ldots] \big)$.

To gain some insight about the relationship between $\widehat{\beta}$ and
$\beta$, we compare equations (10) and (2). An integration by
parts of the right-hand side of (2) and the change of variable
$G(v) = 1 - F(v)$ show that $\beta$ satisfies

$$\sum_{s=1}^{S} c_s \int_0^v \frac{dz_s(t)}{dt}\, t\, dt = \sum_{s=1}^{S} c_s \binom{N-1}{s-1} G(v)^{s-1} \int_0^v \beta(t)\, dF^{p}(t). \tag{11}$$

On the other hand, equation (10) implies that for all $i$

$$\mathbf{u}_i = \sum_{s=1}^{S} c_s \Big[ \mathbf{M}_{ii}(s) \boldsymbol{\beta}_i - \binom{N-1}{s-1} \frac{n\, \Delta \mathbf{G}_i^{s}}{s} \sum_{j=1}^{i-1} \Delta \mathbf{F}_j^{p} \boldsymbol{\beta}_j \Big]. \tag{12}$$

Moreover, by Lemma 2 and Proposition 10 in the Appendix,
the equalities $[\ldots]$ hold. Thus, equation (12) resembles a
numerical scheme for solving (11) where the integral on the
right-hand side is approximated by the trapezoidal rule.
Equation (11) is in fact a Volterra equation of the first kind
with kernel $K(t, v) = [\ldots]$.
Therefore, we could benefit from the extensive literature
on the convergence analysis of numerical schemes for this
type of equations (Baker, 1977; Kress et al., 1989; Linz,
1985). However, equations of the first kind are in general
ill-posed problems (Kress et al., 1989), that is, small perturbations
on the equation can produce large errors on the solution.
When the kernel $K$ satisfies $\min_{t \in [0,1]} K(t, t) > 0$,
there exists a standard technique to transform an equation
of the first kind to an equation of the second kind, which is
a well-posed problem, thus making the convergence analysis
for these types of problems much simpler. The kernel
function appearing in (11) does not satisfy this property and
therefore these results are not applicable to our scenario.
To the best of our knowledge, there exists no quadrature
method for solving Volterra equations of the first kind with
vanishing kernel.

Figure 2: (a) Empirical verification of Assumption 2. The
blue line corresponds to the quantity $\max_i \Delta \boldsymbol{\beta}_i$. In red we
plot the desired upper bound for $C = 1/2$.
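The allocation probability of the discrete auction can be evaluated directly from a counting argument, which gives a quick numerical check of the tie-breaking logic. The sketch below assumes the following reading of Proposition 3: for a bidder holding the $i$-th smallest sample value, $j$ of the $N - 1$ opponents draw a strictly smaller value (probability $(i-1)/n$ each), $k$ draw a strictly larger one (probability $1 - i/n$ each), the remaining ones tie (probability $1/n$ each), and the tied bidders share their positions uniformly. The function names are hypothetical.

```python
from math import factorial


def multinomial(total, j, k):
    """Number of ways to split `total` opponents into groups of size j, k and the rest."""
    return factorial(total) // (factorial(j) * factorial(k) * factorial(total - j - k))


def discrete_win_probability(i, s, n, n_bidders):
    """Probability that the bidder with the i-th smallest of n values gets position s."""
    below = (i - 1) / n      # F_{i-1}
    above = 1.0 - i / n      # G_i
    total = 0.0
    for j in range(n_bidders - s + 1):
        for k in range(s):
            ties = n_bidders - 1 - j - k
            weight = multinomial(n_bidders - 1, j, k)
            # The bidder and the `ties` tied opponents share ties + 1 positions.
            total += weight * below ** j * above ** k / n ** ties / (ties + 1)
    return total


# Probabilities over all N positions for the median value of a sample of size 10.
position_probs = [discrete_win_probability(5, s, 10, 4) for s in range(1, 5)]
```

Summing over all $N$ positions (not only the $S$ slots) returns 1, as expected of a probability distribution over ranks.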

In addition to dealing with an uncommon integral equa¬ 
tion, we need to address the problem that the elements of 
(10) are not exact evaluations of the functions defining (11) 
but rather stochastic approximations of these functions. Fi¬ 
nally, the grid points used for the numerical approximation 
are also random. 

In order to prove convergence of the function /3 to /3 we 
will make the following assumptions 

Assumption 1. There exists a constant c > 0 such that 
f{x) > c for all X € [0,1]. 

This assumption is needed to ensure that the difference be¬ 
tween consecutive samples Vi — Vi-i goes to 0 as n —?- oo, 
which is a necessary condition for the convergence of any 
numerical scheme. 

Assumption 2. The solution j3 of (10) satisfies Vi, (3^ > 0 
for all i and maxigi^,.. A/3j < for some universal 
constant C. 


Figure 3; Approximation of the empirical bidding function 
P to the true solution fi. The true solution is shown in red 
and the shaded region represents the confidence interval of 
P when simulating the discrete GSP 10 times with a sam¬ 
ple of size 200. Where A = 3, ^ = 2, ci = 1, ca = 0.5 
and bids were sampled uniformly from [0,1] 

This is satisfied if for instance the distribution function F 
is twice continuously differentiable. We can now present 
our main result. 

Theorem 3. If Assumptions 1, 2 and 3 are satisfied, then, 
for any i5 > 0, with probability at least 1 — 5 over the draw 
of a sample of size n, the following bound holds for all 
i G [1, n]: 



where q{n, ^) = f log(nc/2<5) with c defined in Assump¬ 
tion 1, and where C is a universal constant. 


The proof of this theorem is highly technical, thus, we defer 
it to Appendix F. 

6 EXPERIMENTS 

Here we present preliminary experiments showing the ad¬ 
vantages of our algorithm. We also present empirical ev¬ 
idence showing that the procedure proposed in Sun et al. 
(2014) to estimate valuations from bids is incorrect. In con¬ 
trast, our density estimation algorithm correctly recovers 
valuations from bids in equilibrium. 

6.1 SETUP 


Since $\beta_i$ is a bidding strategy in equilibrium, it is reasonable
to expect that $v_i \ge \beta_i \ge 0$. On the other hand, the
assumption on $\Delta\beta_i$ is related to the smoothness of the solution.
If the function $\beta$ is smooth, we should expect the approximation
$\hat\beta$ to be smooth too. Both assumptions can in
practice be verified empirically. Figure 2 depicts the quantity
$\max_{i \in \{1,\ldots,n\}} \Delta\beta_i$ as a function of the sample size $n$.

Assumption 3. The solution $\beta$ to (2) is twice continuously
differentiable.


Let $F_1$ and $F_2$ denote the distributions of two truncated log-normal
random variables with parameters $\mu_1 = \log(.5)$,
$\sigma_1 = .8$ and $\mu_2 = \log(2)$, $\sigma_2 = .1$; the mixture parameter
was set to $1/2$. Here, $F_1$ is truncated to have support in
$[0, 1.5]$ and $F_2$ to have support in $[0, 2.5]$. We consider a
GSP with $N = 4$ advertisers, $S = 3$ slots and position
factors $c_1 = 1$, $c_2 = .45$ and $c_3 = .1$. Based on the results of
Section 5 we estimate the bidding function $\beta$ with a sample
of 2000 points and we show its plot in Figure 4. We proceed
to evaluate the method proposed by Sun et al. (2014) for
recovering advertisers' valuations from bids in equilibrium.
The assumption made by the authors is that the advertisers
play an SNE, in which case valuations can be inferred by
solving a simple system of inequalities defining the SNE
(Varian, 2007). Since the authors do not specify which SNE
the advertisers are playing, we select the one that solves the
SNE conditions with equality.

Figure 4: Bidding function for our experiments in blue and
identity function in red (horizontal axis: true valuations).
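The valuation distribution of this setup can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the experiments: we assume that "truncated" means conditioning the log-normal on lying below the upper end of its support (implemented by rejection), and that each advertiser's valuation is an independent draw from the equal-weight mixture. The helper names are hypothetical.

```python
import math
import random

def truncated_lognormal(rng, mu, sigma, upper):
    """Rejection-sample a log-normal(mu, sigma) conditioned on being <= upper."""
    while True:
        x = rng.lognormvariate(mu, sigma)
        if x <= upper:
            return x

def sample_valuation(rng):
    """Draw from the mixture of F_1 and F_2 with mixture parameter 1/2."""
    if rng.random() < 0.5:
        return truncated_lognormal(rng, math.log(0.5), 0.8, 1.5)
    return truncated_lognormal(rng, math.log(2.0), 0.1, 2.5)

rng = random.Random(0)  # fixed seed, for reproducibility only
# 300 simulated auctions with N = 4 advertisers each, i.e. 1200 valuations.
auctions = [[sample_valuation(rng) for _ in range(4)] for _ in range(300)]
```

Bids would then be obtained by applying the estimated equilibrium bidding function to each valuation, a step not reproduced here.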

We generated a sample $\mathcal{S}$ consisting of $n = 300$ i.i.d. outcomes
of our simulated auction. Since $N = 4$, the effective
size of this sample is 1200 points. We generated
the outcome bid vectors $b_1, \ldots, b_n$ by using the equilibrium
bidding function $\beta$. Assuming that the bids constitute
an SNE, we estimated the valuations; Figure 5 shows a
histogram of the original sample as well as the histogram of
the estimated valuations. It is clear from this figure that this
procedure does not accurately recover the distribution of
the valuations. By contrast, the histogram of the estimated
valuations using our density estimation algorithm is shown
in Figure 5(c). The kernel function used by our algorithm
was a triangular kernel given by $K(u) = (1 - |u|)\mathbb{1}_{|u| \le 1}$.
Following the experimental setup of Guerre et al. (2000),
the bandwidth $h$ was set to $h = 1.06\,\hat\sigma n^{-1/5}$, where $\hat\sigma$
denotes the standard deviation of the sample of bids.
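A kernel density estimate with the triangular kernel can be written compactly. The sketch below is illustrative only: it assumes the rule-of-thumb bandwidth $h = 1.06\,\hat\sigma n^{-1/5}$, uses uniformly drawn placeholder bids rather than bids from the simulated auction, and covers only the density estimate of the bids, not the full valuation-recovery procedure.

```python
import random
import statistics

def triangular_kernel(u):
    """K(u) = (1 - |u|) on [-1, 1] and 0 elsewhere."""
    return max(0.0, 1.0 - abs(u))

def bandwidth(sample):
    """Rule-of-thumb bandwidth h = 1.06 * sigma_hat * n^(-1/5)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)

def kde(x, sample, h):
    """Kernel density estimate of the sample at the point x."""
    return sum(triangular_kernel((x - b) / h) for b in sample) / (len(sample) * h)

rng = random.Random(0)  # fixed seed, for reproducibility only
bids = [rng.random() for _ in range(200)]  # placeholder bids
h = bandwidth(bids)
density_at_half = kde(0.5, bids, h)
```

Because the triangular kernel has compact support, each evaluation only involves the bids within distance $h$ of the query point.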

Finally, we use both our density estimation algorithm and
discriminative learning algorithm to infer the optimal value
of $r$. To test our algorithm we generated a test sample of
size $n = 500$ with the procedure previously described. The
results are shown in Table 1.

Algorithm             Mean revenue
Density estimation    1.42 ± 0.02
Discriminative        1.85 ± 0.02

Table 1: Mean revenue for our two algorithms.

7 CONCLUSION 

We proposed and analyzed two algorithms for learning optimal
reserve prices for generalized second price auctions.
Our first algorithm is based on density estimation and therefore
suffers from the standard problems associated with
this family of algorithms. Furthermore, this algorithm is
only well defined when bidders play in equilibrium. Our
second algorithm is novel and is based on learning theory
guarantees. We show that the algorithm admits an efficient
$O(nS \log(nS))$ implementation. Furthermore, our theoretical
guarantees are more favorable than those presented
for the previous algorithm of Sun et al. (2014). Moreover,
even though it is necessary for advertisers to play in equilibrium
for our algorithm to converge to optimality, when
bidders do not play an equilibrium, our algorithm is still
well defined and minimizes a quantity of interest, albeit over
a smaller set. We also presented preliminary experimental
results showing the advantages of our algorithm. To our
knowledge, this is the first attempt to apply learning algorithms
to the problem of reserve price selection in GSP auctions.
We believe that the use of learning algorithms in revenue
optimization is crucial and that this work may preface
a rich research agenda including extensions of this work to
a general learning setup where auctions and advertisers are
represented by features. Additionally, in our analysis, we
considered two different ranking rules. It would be interesting
to combine the algorithm of Zhu et al. (2009) with
this work to learn both a ranking rule and an optimal reserve
price. Finally, we provided the first analysis of convergence
of bidding functions in an empirical equilibrium
to the true bidding function. This result on its own is of
great importance as it justifies the common assumption of
advertisers playing in a Bayes-Nash equilibrium.

Figure 5: Comparison of methods for estimating valuations
from bids. (a) Histogram of true valuations. (b) Valuations
estimated under the SNE assumption. (c) Density estimation
algorithm.
A THE ENVELOPE THEOREM


The envelope theorem is a well known result in auction
mechanism design characterizing the maximum of a
parametrized family of functions. The most general form
of this theorem is due to Milgrom and Segal (2002) and we
include its proof here for completeness. Let $X$ be an
arbitrary space and consider a function $f: X \times [0,1] \to \mathbb{R}$.
We define the envelope function $V$ and the set-valued function
$X^*$ as

\[ V(t) = \sup_{x \in X} f(x,t) \quad \text{and} \quad X^*(t) = \{x \in X : f(x,t) = V(t)\}. \]

We show a plot of the envelope function in Figure 6.

Theorem 4 (Envelope Theorem). Let $f(x, \cdot)$ be an absolutely
continuous function for every $x \in X$. Suppose also that
there exists an integrable function $b: [0,1] \to \mathbb{R}_+$ such that
for every $x \in X$, $\left|\frac{\partial f}{\partial t}(x,t)\right| \le b(t)$ almost everywhere in $t$.
Then $V$ is absolutely continuous. If in addition $f(x, \cdot)$ is
differentiable for all $x \in X$, $X^*(t) \ne \emptyset$ almost everywhere
on $[0,1]$ and $x^*(t)$ denotes an arbitrary element in $X^*(t)$,
then

\[ V(t) = V(0) + \int_0^t \frac{\partial f}{\partial t}(x^*(s), s)\,ds. \]


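Proof. By definition of $V$, for any $t', t'' \in [0,1]$ we have

\[ |V(t'') - V(t')| \le \sup_{x \in X} |f(x,t'') - f(x,t')| = \sup_{x \in X} \left| \int_{t'}^{t''} \frac{\partial f}{\partial t}(x,s)\,ds \right| \le \int_{t'}^{t''} b(t)\,dt. \]

This easily implies that $V(t)$ is absolutely continuous.
Therefore, $V$ is differentiable almost everywhere and
$V(t) = V(0) + \int_0^t V'(s)\,ds$. Finally, if $f(x,t)$ is differentiable
in $t$ then we know that $V'(t) = \frac{\partial f}{\partial t}(x^*(t), t)$ for
any $x^*(t) \in X^*(t)$ whenever $V'(t)$ exists and the result
follows. □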
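The integral formula of Theorem 4 can be checked numerically on a toy family. The sketch below is an illustration only: the choice $f(x,t) = xt - x^2/2$ on a grid of $[0,1]$ is ours and does not come from the auction setting. For this family $x^*(t) \approx t$ and $V(t) \approx t^2/2$, so both sides of the formula are easy to compare.

```python
def f(x, t):
    """Toy parametrized family, chosen only for illustration."""
    return x * t - 0.5 * x * x

def df_dt(x, t):
    """Partial derivative of f with respect to t."""
    return x

X = [k / 1000 for k in range(1001)]  # finite grid standing in for the space X

def argmax_x(t):
    """An element of X*(t) on the grid."""
    return max(X, key=lambda x: f(x, t))

def V(t):
    """Envelope function V(t) = max_x f(x, t)."""
    return f(argmax_x(t), t)

def envelope_integral(t, steps=200):
    """V(0) plus the midpoint-rule integral of df/dt(x*(s), s) over [0, t]."""
    ds = t / steps
    mids = [(k + 0.5) * ds for k in range(steps)]
    return V(0.0) + sum(df_dt(argmax_x(s), s) for s in mids) * ds

print(V(1.0), envelope_integral(1.0))  # both close to 1/2
```

The two printed values agree up to the discretization error of the grid and of the midpoint rule.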


B ELEMENTARY CALCULATIONS 


We present elementary results of calculus that will be used
throughout the rest of this Appendix.

Lemma 2. The following equality holds for any $k \in \mathbb{N}$:




-F?-i 
2-1

Figure 6: Illustration of the envelope function.


Proof. The result follows from a straightforward application
of Taylor's theorem to the function $h(x) = x^k$. Notice
that $F_i^k = h(i/n)$, therefore:




n 

= AA, 


n 


for some $\zeta_i \in [(i-1)/n, i/n]$. Since $h''(x) = k(k-1)x^{k-2}$,
it follows that the last term in the previous expression
is in $\frac{i^{k-2}}{n^k} O(1)$. The second equality can be
similarly proved. □


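Proposition 5. Let $a, b \in \mathbb{R}$ and $N \ge 1$ be an integer, then

\[ \sum_{j=0}^{N} \binom{N}{j} \frac{a^j b^{N-j}}{j+1} = \frac{(a+b)^{N+1} - b^{N+1}}{a(N+1)}. \quad (13) \]

Proof. The proof relies on the fact that $\frac{a^j}{j+1} = \frac{1}{a}\int_0^a t^j\,dt$.
The left hand side of (13) is then equal to

\[ \frac{1}{a}\int_0^a \sum_{j=0}^{N} \binom{N}{j} t^j b^{N-j}\,dt = \frac{1}{a}\int_0^a (t+b)^N\,dt = \frac{(a+b)^{N+1} - b^{N+1}}{a(N+1)}. \]

□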
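Identity (13) is easy to confirm with exact rational arithmetic. The following sketch is only a sanity check for a few arbitrary values with $a \ne 0$ (the right hand side divides by $a$); the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def lhs(a, b, N):
    """Left hand side of (13): sum_j C(N, j) a^j b^(N-j) / (j + 1)."""
    return sum(comb(N, j) * a**j * b**(N - j) / Fraction(j + 1) for j in range(N + 1))

def rhs(a, b, N):
    """Right hand side of (13), which requires a != 0."""
    return ((a + b)**(N + 1) - b**(N + 1)) / (a * (N + 1))

a, b = Fraction(1, 3), Fraction(2, 5)
print(all(lhs(a, b, N) == rhs(a, b, N) for N in range(1, 8)))
```

Working with `Fraction` avoids floating-point error, so the two sides can be compared with exact equality.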


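Lemma 3. If the sequence $a_i \ge 0$ satisfies

\[ a_i \le \delta \quad \forall i \le r \qquad \text{and} \qquad a_i \le A + B \sum_{j=1}^{i-1} a_j \quad \forall i > r, \]

then

\[ a_i \le (A + r\delta B)(1 + B)^{i-r-1} \le (A + r\delta B)\,e^{B(i-r-1)} \quad \forall i > r. \]

This lemma is well known in the numerical analysis community
and we include the proof here for completeness.

Proof. We proceed by induction on $i$. The base of our induction
is given by $i = r + 1$ and it can be trivially verified.
Indeed, by assumption

\[ a_{r+1} \le A + r\delta B. \]

Let us assume that the proposition holds for values less than
$i$ and let us try to show it also holds for $i$:

\[
\begin{aligned}
a_i &\le A + B \sum_{j=1}^{r} a_j + B \sum_{j=r+1}^{i-1} a_j \\
&\le A + rB\delta + B \sum_{j=r+1}^{i-1} (A + rB\delta)(1+B)^{j-r-1} \\
&= A + rB\delta + (A + rB\delta) B \sum_{j=0}^{i-r-2} (1+B)^j \\
&= A + rB\delta + (A + rB\delta) B\, \frac{(1+B)^{i-r-1} - 1}{B} \\
&= (A + rB\delta)(1+B)^{i-r-1}.
\end{aligned}
\]

□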
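The bound of Lemma 3 is attained by the sequence that satisfies both hypotheses with equality, which makes it simple to check numerically. The sketch below builds that extremal sequence for arbitrary illustrative constants and compares it with both bounds; none of the numbers come from the paper.

```python
import math

def extremal_sequence(A, B, delta, r, n):
    """a_i = delta for i <= r and a_i = A + B * sum_{j<i} a_j for i > r."""
    a = [delta] * r
    for _ in range(r, n):
        a.append(A + B * sum(a))
    return a  # a[i - 1] stores a_i

def geometric_bound(A, B, delta, r, i):
    """(A + r*delta*B) * (1 + B)^(i - r - 1), valid for i > r."""
    return (A + r * delta * B) * (1 + B) ** (i - r - 1)

A, B, delta, r, n = 0.1, 0.05, 0.2, 3, 30
seq = extremal_sequence(A, B, delta, r, n)
for i in range(r + 1, n + 1):
    exp_bound = (A + r * delta * B) * math.exp(B * (i - r - 1))
    assert seq[i - 1] <= geometric_bound(A, B, delta, r, i) * (1 + 1e-9) <= exp_bound * (1 + 1e-9)
```

Any other sequence satisfying the hypotheses is dominated term by term by this one, which is the content of the induction argument.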

Lemma 4. Let $W_0: [e, \infty) \to \mathbb{R}$ denote the main branch
of the Lambert function, i.e. $W_0(x)e^{W_0(x)} = x$. The following
inequality holds for every $x \in [e, \infty)$:

\[ \log(x) \ge W_0(x). \]

Proof. By definition of $W_0$ we see that $W_0(e) = 1$. Moreover, $W_0$
is an increasing function. Therefore for any $x \in [e, \infty)$

\[ W_0(x) \ge 1 \;\Rightarrow\; W_0(x)\,x \ge x \;\Rightarrow\; W_0(x)\,x \ge W_0(x)\,e^{W_0(x)}. \]

The result follows by dividing by $W_0(x) > 0$ and taking logarithms
on both sides of the last inequality. □
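Lemma 4 can be checked numerically without any special-function library. The sketch below computes $W_0$ by Newton's method on $w e^w = x$; this solver is our own illustrative helper, not something used in the paper. Starting from $w = \log x$, which by the lemma lies above the root, the iteration decreases monotonically to $W_0(x)$.

```python
import math

def lambert_w0(x, tol=1e-12):
    """Principal branch W_0(x) for x >= e, via Newton's method on w*exp(w) = x."""
    w = math.log(x)  # starting point above the root, by Lemma 4
    for _ in range(100):
        step = (w * math.exp(w) - x) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

for x in (math.e, 10.0, 1e3, 1e6):
    print(x, lambert_w0(x), math.log(x))  # W_0(x) never exceeds log(x)
```

The gap between $\log x$ and $W_0(x)$ grows like $\log \log x$, so replacing $W_0$ by the logarithm loses only a lower-order factor.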

C PROOF OF PROPOSITION 4 

Here, we derive the linear equation that must be satisfied
by the bidding function $\beta$. For the most part, we adapt
the analysis of Gomes and Sweeney (2014) to a discrete
setting.

Proposition 3. In a symmetric efficient equilibrium of the
discrete GSP, the probability $z_s(v)$ that an advertiser with
valuation $v$ is assigned to slot $s$ is given by

\[ z_s(v) = \sum_{j=0}^{N-s} \sum_{k=0}^{s-1} \binom{N-1}{j,\,k,\,N-1-j-k} \frac{F_{i-1}^j G_i^k}{(N-j-k)\,n^{N-1-j-k}} \]

if $v = v_i$ and otherwise by

\[ z_s(v) = \binom{N-1}{s-1} \lim_{v' \to v^-} F(v')^p (1 - F(v))^{s-1} =: z_s^-(v), \]

where $p = N - s$.

Proof. Since advertisers play an efficient equilibrium,
these probabilities depend only on the advertisers' valuations.
Let $A_{j,k}(s, v)$ denote the event that $j$ buyers have a
valuation lower than $v$, $k$ of them have a valuation higher
than $v$ and $N - 1 - j - k$ a valuation exactly equal to $v$.
Then, the probability of assigning $s$ to an advertiser with
value $v$ is given by

\[ \sum_{j=0}^{N-s} \sum_{k=0}^{s-1} \frac{1}{N-j-k} \Pr(A_{j,k}(s,v)). \quad (14) \]

The factor $\frac{1}{N-j-k}$ appears due to the fact that the slot is
randomly assigned in the case of a tie. When $v = v_i$, this
probability is easily seen to be:

\[ \Pr(A_{j,k}(s,v_i)) = \binom{N-1}{j,\,k,\,N-1-j-k} \frac{F_{i-1}^j G_i^k}{n^{N-1-j-k}}. \]

On the other hand, if $v \in (v_{i-1}, v_i)$ the event $A_{j,k}(s,v)$
happens with probability zero unless $j = N - s$ and $k = s - 1$.
Therefore, (14) simplifies to

\[ \binom{N-1}{s-1} F(v)^p (1 - F(v))^{s-1} = \binom{N-1}{s-1} \lim_{v' \to v^-} F(v')^p (1 - F(v))^{s-1}. \]

□
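A quick consistency check on the assignment probabilities is that, when there are as many slots as advertisers, an advertiser is assigned to exactly one of them, so the probabilities must sum to one over $s = 1, \ldots, N$. The sketch below verifies this for the tie-breaking formula at $v = v_i$ and for the continuous formula. It rests on our reading of the notation, namely $F_{i-1} = (i-1)/n$ and $G_i = (n-i)/n$ for an empirical distribution putting mass $1/n$ on each $v_i$; these conventions and the function names are assumptions.

```python
from fractions import Fraction
from math import comb, factorial

def z_discrete(s, i, n, N):
    """Tie-aware probability of slot s for value v_i (assumed notation, see lead-in)."""
    F_prev, G = Fraction(i - 1, n), Fraction(n - i, n)
    total = Fraction(0)
    for j in range(N - s + 1):          # j rivals with a lower valuation
        for k in range(s):              # k rivals with a higher valuation
            m = N - 1 - j - k           # m rivals tied at v_i
            multinom = factorial(N - 1) // (factorial(j) * factorial(k) * factorial(m))
            total += multinom * F_prev**j * G**k * Fraction(1, n)**m / (m + 1)
    return total

def z_continuous(s, F, N):
    """C(N-1, s-1) F^(N-s) (1-F)^(s-1), the probability without ties."""
    return comb(N - 1, s - 1) * F**(N - s) * (1 - F)**(s - 1)

n, N = 5, 4
print([sum(z_discrete(s, i, n, N) for s in range(1, N + 1)) for i in range(1, n + 1)])
```

Each entry of the printed list is exactly one, since for a fixed pair $(j, k)$ there are $N - j - k$ admissible slots, which cancels the tie-breaking factor and leaves a multinomial expansion of $(F_{i-1} + G_i + 1/n)^{N-1}$.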

Proposition 6. Let E[P^^(u)] denote the expected payoff 
of an advertiser with value v at equilibrium. Then 

s 

Proof. By the revelation principle (Gibbons, 1992), there 
exists a truth revealing mechanism with the same expected 
payoff function as the GSP with bidders playing an equi¬ 
librium. For this mechanism, we then must have 

s 

V G are max > CsZs{v)v — E[P^'®(u)l. 

®G[0.1] jyy 








By the envelope theorem (see Theorem 4), we have where the second equality follows from an application of 

the binomial theorem and Proposition 5. On the other hand 
^ ^ /■”• if z = 7 this probability is given by; 

y^^CsZs{vi)vi-E[P^^(Vj)] = -E[P-^-^(0)]+y^ I Zs{t)dt. 

N-s-l s-1 / AT 1 \ TTlJ r^k 


Since the expected payoff of an advertiser with valuation 0 
should be zero too, we see that 


E[P^^{Vi)] = CsZsiVi)Vi - 




Using the fact that Zs{t) = {vi) for t G {vi-i,Vi) we 
obtain the desired expression. □ 

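Proposition 4. If the discrete GSP auction admits a symmetric
efficient equilibrium, then its bidding function $\beta$ satisfies
$\beta(v_i) = \beta_i$, where $\boldsymbol{\beta}$ is the solution of the following
linear equation:

\[ \mathbf{M}\boldsymbol{\beta} = \mathbf{u}, \]

where $\mathbf{M} = \sum_{s=1}^{S} c_s \mathbf{M}(s)$ and

\[ u_i = \sum_{s=1}^{S} c_s \Big( z_s(v_i)\,v_i - \sum_{j=1}^{i} z_s^-(v_j)(v_j - v_{j-1}) \Big). \]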


s=l 




E E 

j—0 k—0 


N-l 




j,k, N — I — j — kj {N — j — k)n^~'^~I~^ 


It is now clear that $\mathbf{M}(s)_{ij} = \Pr(A(s, v_i, v_j))$ for
$i \ge j$. Finally, given that in equilibrium the equality
$E[P^{PE}(v)] = E[P^{\beta}(v)]$ must hold, by Proposition 6, we
see that $\boldsymbol{\beta}$ must satisfy equation (10). □
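Since the entries $\mathbf{M}(s)_{ij}$ are only non-zero for $i \ge j$, the system $\mathbf{M}\boldsymbol{\beta} = \mathbf{u}$ is lower triangular and can in principle be solved row by row. The following is a generic forward-substitution sketch on a toy system; the matrix and right hand side are made up for illustration and are not auction quantities, and the routine is not claimed to be the implementation used in the paper.

```python
def forward_substitution(M, u):
    """Solve M beta = u for a lower-triangular M with non-zero diagonal."""
    beta = []
    for i, row in enumerate(M):
        # Contribution of the already computed entries beta_0, ..., beta_{i-1}.
        partial = sum(row[j] * beta[j] for j in range(i))
        beta.append((u[i] - partial) / row[i])
    return beta

M = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, 5.0, 6.0]]
u = [2.0, 7.0, 32.0]
print(forward_substitution(M, u))
```

Each step divides by a diagonal entry, which is why the size of the diagonal terms matters for the stability of this kind of scheme.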

We conclude this section with a simpler expression for
$\mathbf{M}_{ii}(s)$. By adding and subtracting the term $j = N - s$
in the expression defining $\mathbf{M}_{ii}(s)$ we obtain


s-l . 

Mjj(s) = Zs{vi) - ^ ( 
k=0 ^ 

IE 

= ZsiVi) + 


iV-l 


Ff iGf 


N — s,k,s—l — kJ (s — 

S-l / -.\ T^p 


/c=l 


A^- I 

S — I 


^ i-l 


S — I 

k 

nAGi 


FtiGl 


(s — k)n 


s—l — k 


(15) 


Proof. Let $E[P^{\beta}(v_i)]$ denote the expected payoff of an advertiser
with value $v_i$ when all advertisers play the bidding
function $\beta$. Let $A(s, v_i, v_j)$ denote the event that an advertiser
with value $v_i$ gets assigned slot $s$ and the $s$-th highest
valuation among the remaining $N - 1$ advertisers is $v_j$. If
the event $A(s, v_i, v_j)$ takes place, then the advertiser has an
expected payoff of $c_s \beta(v_j)$. Thus,

\[ E[P^{\beta}(v_i)] = \sum_{s=1}^{S} c_s \sum_{j=1}^{i} \beta(v_j) \Pr(A(s, v_i, v_j)). \]

S=1 j=l 


where again we used Proposition 5 for the last equality. 

D HIGH PROBABILITY BOUNDS 

In order to improve the readability of our proofs we use a
fixed variable $C$ to refer to a universal constant even though
this constant may be different in different lines of a proof.

Theorem 5. (Glivenko-Cantelli Theorem) Let $v_1, \ldots, v_n$
be an i.i.d. sample drawn from a distribution $F$. If $\hat F$ denotes
the empirical distribution function induced by this
sample, then with probability at least $1 - \delta$ for all $v \in \mathbb{R}$


In order for event $A(s, v_i, v_j)$ to occur for $i \ne j$, $N - s$
advertisers must have valuations less than or equal to $v_j$
with equality holding for at least one advertiser. Also, the
valuation of $s - 1$ advertisers must be greater than or equal
to $v_i$. Keeping in mind that a slot is assigned randomly
in the event of a tie, we see that $A(s, v_i, v_j)$ occurs with
probability


N-s-ls-l .. 

E E f!: 

1^0 k-O 


N-l 

s —1 

A^-1 
s —1 


5—1 
N-s-l 

E 

1=0 


s —1 

k 


N-s\ F' 


■f-i 


G 


S—l — k 


N-s-l 


N-s\ F'_ 


I 


n 

s-l 


{k + l)n^ 


^ i-l 

AT-s-Z 






- j-i 


n\ 


s—l — k 

■i _ 

[k + l)n^ 
GU-Gf) 


s-l\ G 
k 


\[ |F(v) - \hat F(v)| \le C \sqrt{\frac{\log(2/\delta)}{n}}. \]
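A uniform deviation bound of this form can be observed on simulated data. The sketch below uses the Dvoretzky-Kiefer-Wolfowitz inequality, which gives the explicit constant $C = 1/\sqrt{2}$; that specific constant is our choice for the illustration and is not stated in the text. The sample is uniform on $[0,1]$, so $F(v) = v$ and the supremum is attained at a jump of the empirical distribution.

```python
import math
import random

def sup_deviation_uniform(sample):
    """sup_v |F_hat(v) - v| for a sample in [0, 1], evaluated at the jumps of F_hat."""
    xs = sorted(sample)
    n = len(xs)
    return max(max(abs((k + 1) / n - x), abs(k / n - x)) for k, x in enumerate(xs))

def dkw_bound(n, delta):
    """sqrt(log(2/delta) / (2n)), the DKW bound holding with probability >= 1 - delta."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

rng = random.Random(0)  # fixed seed, for reproducibility only
n, delta = 2000, 1e-6
deviation = sup_deviation_uniform([rng.random() for _ in range(n)])
print(deviation, dkw_bound(n, delta))
```

With such a small $\delta$ the bound is conservative, and the observed deviation is typically several times smaller.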

Proposition 7. Let Xi ,..., Xn be an i.i.d sample from 
a distribution F supported in [0,1]. Suppose F admits a 
density f and assume f{x) > c for all x G [0,1]. If 
XP\ ... ,XP^'> denote the order statistics of a sample of 
size n and we let = 0 , then 

Pr( max X^^ - X^^-P > e) < -e-«"/2. 
Te{i.....n} e 

In particular, with probability at least 1 — 5: 

max XW - < -g(n,(5), (16) 

iG{l,...,n} n 


(N -l\ nAFjAGi 

[s-l) ^ 


whereq(n,5) = f logoff). 














Proof. Divide the interval [0,1] into k < [2/e] sub¬ 
intervals of size Denote this sub-intervals by Ii,..., 
with Ij = [aj,bj] . If there exists i such that — 
x(*-i) > e then at least one of these sub-intervals must 
not contain any samples. Therefore: 

Pr( max > e) < Pr(3 j s.t X, ^ I, Vi) 

r2/v 

< ^ Pr(X, i I, V*). 
i=i 

Using the fact that the sample is i.i.d. and that F{b}.) — 
F{ak) > fix){bk - ak) > c{bk - ak), we 

may bound the last term by 

(^) (1 - ^ /(I - 

e 

The equation = 6 implies e = 

where Wq denotes the main branch of the Lambert func¬ 
tion (the inverse of the function x i-->^ xe^). By Lemma 4, 
for X G [e, oo) we have 


^ \/iog 2 /^(z-r\/n) ^ since i > it follows that 

F{Q^-^ < F{v,Y-^ < c(^log(2/5)(P-i)/2). 

Replacing this term in our original bound yields the result. 

□ 


Proposition 8. Let ■ [0,1] —>■ K a twice continuously 
differentiable function. With probability at least 1 — (5 the 
following bound holds for all i > i/n 


i=i 

, F log(2/,5)J’/2 


< C 


nP yLn 


q{n,5/‘2.f 


and 


i-2 


Y{t)pFP - ^r/(uj)AFj 

i=i 

, F log(2/5)P/2 


< C- 


nP \/n 


q{n,5/2Y 


log(x) > Wo{x). 
Therefore, with probability at least 1 — 

max < — 

iG{l,...,n} nC 


6 


(17) 


Proof. By splitting the integral along the intervals 
we obtain 


/3cn\ 

rFPm-j:^u^-^ 

K'W)' 



□ 


The following estimates will be used in the proof of Theorem 3.

Lemma 5. Let $p \ge 1$ be an integer. If $i \ge \sqrt{n}$, then for
any $t \in [v_{i-1}, v_i]$ the following inequality is satisfied with
probability at least $1 - \delta$:


jP-i 


\Fnv)-FU\<C— 


\og{2/5y- 


nP~ 


-q{n,2/6) 


Proof. The left hand side of the above inequality may be 
decomposed as 


<\T. r +FP{vy{v,-v,.y 

i=i 


(18) 


q{nY/2.). 


By Lemma 5, for t G we have: 

nP ^ yjpt 

Using the same argument of Lemma 5 we see that for i > 
s/n 

F(v,r < c(bXfM.Y 

Therefore we may bound (18) by 


\F^{v)-YU\ 

< \FP{v) - FP{v,_^)\ + \FP{v,_^) - F/_J 

< P\F{f.Y-^f{Q\{v, - +pFrl{F{v,-i) - F,-i) 


<ci^-^f{yy-^ + c^^ 

n nP ^ 


\og{2/5) 


n 


for some Q G {vi-i,Vi). The second inequality fol¬ 
lows from Taylor’s theorem and we have used Glivenko- 
Cantelli’s theorem and Proposition 7 for the last inequal¬ 
ity. Moreover, we know F{vi) < F^ -f \J< 


C 


F-^ \og{2/5)- 


i-l 


nP 


-1 


/n 


■{q{n,5/‘F)^'> 


f=i 


i{v.i, - Vi-l)^J\og{2/5) 


We can again use Proposition 7 to bound the sum by 
^q{n,S/2) and the result follows. In order to proof the 
second bound we first do integration by parts to obtain 



Y{t)pFP ^f{t)dt = Y{vi)FP{vi) 



Y' {t)FP{t)dt. 





























Similarly 


From equation (19) and inequalities (20) and (21) we can 
thus infer that 


2—2 2—2 

i=i i=i 

Using the fact that ip is twice continuously differentiable, 
we can recover the desired bound by following similar steps 
as before. □ 


Proposition 9. With probability at least 1 — 5 the following 
inequality holds for all i 


{s-l)G{viy-^-n- 




< c 


log(l/5) 


\pG{v,y-^F{v,f2nm,ys)\ 

-Frii) 

1 

nP ^ 


The desired bound follows trivially from the last inequality. 

□ 


E SOLUTION PROPERTIES 


Proof By Lemma 2 we know that 

Therefore the left hand side of (9) can be bounded by 

(s-l)|G(u.)^-2-GrV^. 

The result now follows from Glivenko-Cantelli’s theorem. 

□ 

Proposition 10. With probability at least 1 — 5 the follow¬ 
ing bound holds for all i 


A standard way to solve a Volterra equation of the first
kind is to differentiate the equation and transform it into
an equation of the second kind. As mentioned before, this
may only be done if the kernel defining the equation satisfies
$K(t,t) \ge c > 0$ for all $t$. Here we take the discrete
derivative of (10) and show that, in spite of the fact that
the new system remains ill-conditioned, the solution of this
equation has a particular property that allows us to show
that the solution $\boldsymbol{\beta}$ will be close to the solution of a surrogate
linear system which, in turn, will also be close to the true
bidding function $\beta$.

Proposition 11. The solution (3 of equation (10) also sat¬ 
isfies the following equation 

(iM/3 = du (22) 


N-1 
s — 1 


pG(vy^-^F(v,y 


2 nMij(s) 


^ (log(2/5))^ 

“ nP~^ yTi 


q{n,6/2). 


where dMy = and d\ii = 

Furthermore, for j < i — 2 


s 

dWLij ^ ^ Cg 

S = 1 


/A-l\ nAFjA^Gf 

U-l/ s 


W-i- 


Proof By analyzing the sum defining Mii(s) we see that 
all terms with exception of the term given by j = N — s — 1 
and k = s — 1 have a factor of . Therefore, 

nP ^ 77,^ ’ 


and 

s 

{Vi))+Vi-i{zf {Vi)-Zs{Vi-i))). 

2=1 


Furthermore, by Theorem 5 we have 



(19) 


Proof It is clear that the new equation is obtained from 
(10) by subtracting row i—1 from row i. Therefore (3 must 
also satisfy this equation. The expression for follows 
directly from the definition of . Finally, 


|Gr'-G(u,)^-'| <G 

Similarly, by Lemma 5 


log(2/5) 

n 


( 20 ) 


FYl-nv^Y-^\c< — 


F ^ (log(2/5))'’2" 


nP~ 


‘din, 6/2). 

( 21 ) 


ys{Vi)Vi Yj)(Vj - Vj-i) 

i=i 

2-1 

- (zg{Vi-i)Vi-i - - Uj-i)) 

i=i 

= Vi{zs{v^) -zf{v^)) 

-I- {Vi)Vi - Zs{Vi-i)Vi-i - zf {Vi){Vi - Vi-i). 






















Simplifying terms and summing over s yields the desired 
expression for dui. □ 

A straightforward bound on the difference $|\beta_i - \beta(v_i)|$ can
be obtained by bounding the following difference:

\[ \sum_{j=1}^{i} d\mathbf{M}_{ij}\,(\beta(v_j) - \beta_j) = \sum_{j=1}^{i} d\mathbf{M}_{ij}\,\beta(v_j) - d\mathbf{u}_i, \quad (23) \]

and by then solving the system of inequalities defining
$\epsilon_i = |\beta(v_i) - \beta_i|$. In order to do this, however, it is always
assumed that the diagonal terms of the matrix satisfy
$\min_i n\,d\mathbf{M}_{ii} \ge c > 0$ for all $n$, which in view of (19) does
not hold in our case. We therefore must resort to a different
approach. We will first show that for values of $i \le \sqrt{n}$ the
values of $\beta_i$ are close to $v_i$ and similarly $\beta(v_i)$ will be close
to $v_i$. Therefore for $i \le \sqrt{n}$ we can show that the difference
$|\beta(v_i) - \beta_i|$ is small. We will see that the analysis
for $i \ge \sqrt{n}$ is in fact more complicated; yet, by a clever
manipulation of the system (10) we are able to obtain the
desired bound.

Proposition 12. Ifcs > 0 then there exists a constant C > 
0 such that: 

s—1 

Proof. By dehnition of Mii(s) it is immediate that 

with Cs = Csp(y~i^)- The sum can thus be lower bounded 
as follows 


■ ^ I I /i — \ \ N-2 

Y, CsM(s),, > — max |Ci [—) 


(24) 


( \ N—2 / \ iv—ii —i / \ (5 —i 

>Cs(Y^) (l-^) ,we 

have > 1 “ with K = (C'i/C's)i/(®-i). Which 

holds if and only if i > In this case the max term 

of (24) is easily seen to be lower bounded by Ci{K/K -f 
1)^“^. On the other hand, if i < then we can lower 

/ . S N-S-1 

bound this term by CsiKjK -f . The 

result follows immediately from these observations. □ 

Proposition 13. For all i and s the following inequality 
holds: 


N-S-l 


S-1 


|dM,,(s)-dM,,,_i(s)| <C— 


;p-2 2 


Proof From equation (19) we see that 


< 


< 





pF^iGr' 

AYUnAGt 

/ V2 

n 

s 


s—J 

^ 1 nAFf.iAGf 


n 


C^(-] 


A repeated application of Lemma 2 yields the desired re¬ 
sult. □ 

Lemma 6. The following holds for every s and every i 


Uv^)-Kiv.) = M,,(s)-( '■ ; 


N-l 

s—1 




and 

Zf(Vi) - Zs(Vi-i) 

= M(s)i,j_i - M(s)j_i,*_i - n 


N-l 

s—1 


N-l 

s—1 

AG 


i-2 


A^Gf 


Proof. From (15) we know that 

/ N — 1 \ AG^ 

%{vi)-z;{vi) = Mii(s)-^^_^ jnFf_i—^-z7(ui). 

By using the definition of ff{vi) we can verify that the 
right hand side of the above equation is in fact equal to 


Mij(s) - 




nAG 


-+GtZl). 


The second statement can be similarly proved 

z7 (Vi) - ZsiVi-i) = zf (Uj) - M(s)i_i,j_i 
/at _ 1 \ AG ^ 

+ nf +M(s)m-i -M(s),,,_i. 


( 25 ) 


On the other hand we have 

AGf i 




- M(s),,i_i 


= n 


n 


2 qr^2 


N-l 

s—1 

N-l 

s—1 






i-2 




































By replacing this expression into (25) and by definition of where again we used the definition of dM in the last equal¬ 
ity. By replacing this expression in the previous chain of 
equalities we obtain the desired result. □ 


Z^{Vi) - Zs{Vi-i) 

= - n 


N-1 

s —1 


^ri(Gri 


N-l 

s —1 

AG 


i-2 


A^Gf 


Corollary 3. Let p denote the vector defined by 


p-eh::; 


N-l\ nA^G? 


i=i 


□ 


+ CsAvi 


N-l 

s —1 


2-1 


{GtZl 


nAG^ 


Corollary 2. The following equality holds for all i and s. 
dUi = Vi{Zs{v^) - zf (uj)) -I- v^-l{zf {v^) - Zsiv^-l)) 

i-2 

= VidMii{s) + Vi-idMi^i_i{s) + ^ dMij{s)vj 

j=i 

i-2 

f=i 

f^s-i , ^AGf 


N-l\ nA^G 
s—1 


- {vi - v,_i 


s 

N-l 


s —1 


Proof. From the previous proposition we know 

V^{Zs{Vi) -Zf{Vi)) +Vi_i{zf{Vi) - ZsiVi^i)) 

= ViMii{s) + Vi-i{M{s)i^i-i - 


- Vi-in 


N-l 

s —1 


^ i-2 


A^Gf 


Ifijj = V — /3, then tp solves the following system of equa¬ 
tions: 

dMip = p. (26) 

Proof. It is immediate by replacing the expression for du^ 
from the previous corollary into ( 22 ) and rearranging terms. 

□ 

We can now present the main result of this section. 

Proposition 14. Under Assumption 2, with probability at 
least 1 — <5, the solution 'ip of equation (26) satisfies xp^ < 
C^q{n,S). 

Proof. By doing forward substitution on equation (26) we 
have: 

-f dM.ii^p^ 

i-2 


/ (r.s-1 

+ (^.-1 - J (g,_i + 

= UidMji(s) -I- Vi_idMi^j_i(s) - Vi-in 
/ nAGf\ 


A -1 

s —1 


= Pi -f ^dMy t/) 

i=i 

,As=P.+i; 


i-2 


tiA^G^ . p I 

^=1 1=1 


(27) 


where the last equality follows from the definition of dM. 
Furthermore, by doing summation by parts we see that 


Vi-in 


= Vi-2 


N-l 
s — 1 
A -1 
s —1 


pP 

^ i-2 


pP 

^ i-2 


A^Gf 

s 

nA^Gf 


-I- {Vi-i - Vi- 2 ) 
N-l\ nA^G, 


s 

N-l 

s — 1 

i-2 


Fr _2 


nA^Gf 


i-2 


S—1 

-f (Ui_i - Vi_2) 


i=i 

nA^Gf 


j=i 

W-l 

s —1 


pP _ 
^ i-2 


— dMij 

1=1 


N-l\ nA^G 


s —1 


s 

i-1 


A repeated application of Lemma 2 shows that 

1 jN-S i 

P* < Avj, 

1=1 

which by Proposition 7 we know it is bounded by 

Y {N-S-l j2 

Pi < G- ^g(»-,d). 

n n” s i ji2 

Similarly for j < i — 2^^ have 

nA^G® 1 1 

s •' n ^ ^ n 

Finally, Assumption 2 implies that ^p > 0 for all i and since 
dMi_i_i > 0 , the following inequality must hold for all i: 

dMiiip^ < dMi_j_it/;j_i + dMuip^ 




<cL 


1=1 


n n 


N-S-l , „'2 
N-S-1 




1=1 
























In view of Proposition 12 we know that dM-u > 

- _ -N-S-l 

C ^ ^ therefore after dividing both sides of the in- 

equality by dMa, it follows that 




Applying Lemma 3 with A = r = 0 and B = ^ we 
arrive to the following inequality: 


ipi < C^q{n,S)e‘^''^ < C'^q{n,S). 


□ 


We now present an analogous result for the solution /3 of 
(2).Let Cs = Cs {^Zi) and define the functions 


lim„_>.o Fs(w) = 0 , it is not hard to see that 

-m 


= lim 

V —>-0 


= lim 


= lim 

V —>-0 


SLi Gs(u) Fs{t)dt + G;,(u) 

/M(Ef=i ^ Jo /J' 

I2Li^s(v)F'^(v) 

Eti F'Av) 


Since the smallest power in the definition of Fg is attained 
at s = S', the previous limit is in fact equal to: 


lim 
1?—>-0 


/(^^)(/o“ Fs(t)dt + fg jP(t)F's(t)dt^ 


= lim 
1?—>-0 


F's(^) 


(JV-S)F^-s-i(v) 


Fg(u) = CsF^-^(v) Gg(u) = Givy-Z 

It is not hard to verify that Zs(v) = Fs(v)Gs(v) and that 
the integral equation ( 2 ) is given by 

s s 

^ r t{Fyt)Gyt)Ydt = ^Gyv) r pit)F'yt)dt 

s=l -^0 Jo 

(28) 

After differentiating this equation and rearranging terms we 
obtain 


s 

0 = (u-/3(u))^Gg(u)F'g(u) 

+ / l^i'^)K(.t)dt + vG'^{v)Fs{v) 

s=i do 

s 

= {v - /3{v))'^Gs{v)F'yv) 

S^l 

+ ^ g ;( u ) f\t-m)Kit)dt + G'yv) f\sm, 

s=l do Jo 

where the last equality follows from integration by parts. 
Notice that the above equation is the continuous equivalent 
of equation (26). Letting y{v) := v — P{v) we have that 


Using L’Hopital’s rule and simplifying we arrive to the fol¬ 
lowing: 


yiO) = — lim 


F^{v) 


v^o (N - S){N - S - l)f{v) 


y{v)F{v) 

(N-S-l) 


Moreover, since ip is a continuous function, it must be 
bounded and therefore, the previous limit is equal to 0. Us¬ 
ing the same series of steps we also see that: 


- i>'{0) 

= limiL) 

u—fO V 

/; F^-^(t)dt + f;(N-s)ip(t)F^-^-yt)f(t)dt 

v-To v(N - S)F^-^-yv) 


By L’Hopital’s rule again we have the previous limit is 
equal to 


F^-^(v) + (N- S)ip(v)F 


N-S-l 


(v)f(v) 


vTo (NS)(N-S-l)F^-2(v)f(v)v + (N-S)F^-s-yv) 

(30) 

Furthermore, notice that 


F^-^(v) + (N- S)ip(v)F^-^-\v)f(v) 
D-i-o (N — S)(N — S — l)F^~^~'^(v)f(v)v 

y __ I y(v)F(v) 

v^o(N-S)(N-S-l)f(v)v (N-S-l)v 


Where for the last equality we used the fact that 
lim„_>o dddyJ- = f(0) and '0(0) = 0. Similarly, we have: 


ip(v) = - 


Ef=i lo Fs(t)dt + G((v) ip(t)F((t)dt 


Es=iG 0 u)F((u) 

Since lim„_>o Gg(u) = lim„^o Gg(u)//(u) = 


lim 

u —>-0 


F 


N-S 


(v) + (N- S)ip(v)F 


<N-S-1 


{'i’)f(v) 


(N - S)FN-s-^(v) 
F(v 


= lim 


0 N-S 


+ ip(v)f(v) = 0 


(29) 
1 and 























Since the terms in the denominator of (30) are positive, the
two previous limits imply that the limit given by (30) is in
fact 0 and therefore $\psi'(0) = 0$. Thus, by Taylor's theorem
we have $|\psi(v)| \le Cv^2$ for some constant $C$.

Corollary 4. The following inequality holds with probabil¬ 
ity at least 1 — 5 for all % < 

10 * - < C^qin.b). 

Jn 


Proof. Follows trivially from the bound on 0(u), Proposi¬ 
tion 14 and the fact that ^ □ 

— y/n 


Having bounded the magnitude of the error for small values 
of i one could use the forward substitution technique used 
in Proposition 14 to bound the errors = |0j — 0(ui)|. 
Nevertheless, a crucial assumption used in Proposition 14 
was the fact that 0j > 0. This condition, however is not 
necessarily verified by e^. Therefore, a forward substitu¬ 
tion technique will not work. Instead, we leverage the fact 
that is in 0 (;^) and show that 

the solution 0 of a surrogate linear equation is close to 
both 0 and 0 implying that 0 ^ and 0 ('Ui) will be close 
too. Therefore let dM' denote the lower triangular ma¬ 
trix with dM' j = dM-i j for j < i — 2, dM' = 0 
and dM'j = 2d'M.ii. Thus, we are effectively removing 
the problematic term dMi_i_i in the analysis made by for¬ 
ward substitution. The following proposition quantifies the 
effect of approximating the original system with the new 
matrix dM'. 

Proposition 15. Let 0 be the solution to the system of 
equations 

dM'0 = p. 

Then, for all i G {1,... ,n\ it is true that 


10 * -01 < ( 


^ q{n,6) 

Jn 


.,C 


Proof We can show, in the same way as in Proposition 14, 
that 0j < 0^2 qin, 5) with probability at least 1 — d for all 
i. In particular, for i < it is true that 


By using the definition of dM' we see the above equations 
hold if and only if 


*-2 

2dMi*0, = p, - ^ dM*j 0j 

i-2 

2dMii0, = dM*i0j -f Pi - dMi^*_i - ^ dMij0^-. 

i=i 

Taking the difference of these two equations yields a recur¬ 
rence relation for the quantity ei = 0 i — 0 i. 

i-2 

2dMuei = dM*i0j - dM*,i_i0,_i - ^ dMijCj. 


Furthermore we can bound dMii0j — dMi^i_i0j_]^ as fol¬ 
lows: 


|dMi*0, - dMi,*_i0j_0 
< |dMii - dM*,i_i|0j_i -f 10, 


< C 


d q{n,S) C 


nP 


dMi, 


/n 


0 i_ 0 dMi,. 


Where the last inequality follows from Assumption 2 
and Proposition 13 as well as from the fact that 0j < 
^q{n,S). Finally, using the same bound on dMij as in 
Proposition 14 gives us 



Applying Lemma 3 with A = B = ^ and r = 
we obtain the final bound 


10 * 



„3/2 


□ 


10* -0i| < C^qinJ). 

V 

On the other hand by forward substitution we have 

2-1 

rfML 0 i = p, - y]] 
i=i 

i-1 

dM**0i = p* - dMij0^ . 

i=i 


F PROOF OF THEOREM 3 

Proposition 16. Let 0(u) denote the solution of (29) and 
denote by 0 the vector defined by 0i = 0(ui). Then, with 
probability at least 1 — d 

- loe(2/d)^/2 

max n|(dM'0)*-P*| < C - -j= - q{n,6/2f'. 

i^y/n Tl y Tl 

(31) 

Proof By definition of dM' and p^ we can decompose the 


and 







difference n[{d'M.'^p)i — p^) as: 

+ ^ 3 ( 1 ;*) - (Ti(s,i) + T 2 (s,i)) 

S = 1 \ 

nAG: 


— nAv,- 


N-1 
s —1 


i-l 


{Gtll 


• (32) 


where 


Ti(sA) 

'N- 

5—1 

T2(s,i) = 






i=i 

A-l\ AA^G* 


i-2 


S—1 


fM . 


i=i 


" 3 ( 5 , 1 ) = (2nMii{s) - ^fj^Gs{vi))il^{vi) 


and 


Is{Vi) = 


F'^{vi)Gs{vi)^{vi) + G'^{vi) / Fs{t) 


f{Vi 

+ G;{z;.) r F; 


Using the fact that ip solves equation (29) we see that 
X)f=i Csla{vi) = 0. Furthermore, using Lemma 2 as well 
as Proposition 7 we have 


^N-S 


< 


-q{n,S/2) 


Therefore we need only to bound for k = 1,2,3. After 
replacing the values of Gg and Fg by its definitions, Propo¬ 
sition 10 and the fact that ip{vi) < Cvf < C^q'^{n,S) 
imply that with probability at least 1 — J 

,^^^^log{2/sr-Y ,,^,3 

T 3 (s,u,)<C'—-^- q[n,S/2) . 

nP Jn 



Figure 7: (a) Log-normal density used to sample valuations.
(b) True equilibrium bidding function and empirical
approximations (in dark grey) and theoretical confidence
bound around true bidding function. (c) Rate of
convergence to equilibrium as a function of the sample size;
the red line represents the function $0.2/\sqrt{n}$.


It follows from Propositions 8 and 9 that |T 2 (s,i)| < 
g(rt,J/2)^. And the same inequality holds 
for Ti. Replacing these bounds in (32) and using the fact 

•p • jV — S 

^ A nN-s yields the desired inequality. □ 


We proceed to bound the term T 2 . The bound for T 1 can 
be derived in a similar manner. By using the definition of 
Gg and Fg we see that T 2 = (Yi) + "^ 2 ^^) where 


= (I 




-(5-l)G(t;i)^-2jy]AFfA 

i=i 


r^2\s,i) 

(EAFfrA- 


ip{t)pFP ^{t)f{t)dt]{s-l)G{vi) 


7=1 


\s-2 


Proposition 17. For any 6 > 0, with probability at least 
1-5 


max|V'(wi) - '4>i\ 

i 




q{n,S/2f + 


Cq{n,S/2) \ 

n3/2 ) 


Proof. With the same argument used in Corollary 4 we 
see that with probability at least 1 — i5 for i < we 
have \'ip{vi) — i/jJ < -^q(n,5). On the other hand, 
since = pi the previous Proposition implies that for 



























i > 

„|(<iM'w - ?;))j < 

Letting = |'0(t;i) — t/^j |, we see that the previous equation 
defines the following recursive inequality. 


ndM',e. < ,5/2)3 




i-2 

-Cn^dMLe,, 

j=i 


where we used the fact that dM' = 0. Since dML = 

- -AT —S —1 , 

2 Mii > 2C' ^jv-g-i after dividing the above inequality 
by dML we obtain 


Q < <5/2)3 


C 

n 


i-2 



Using Lemma 3 again we conclude that 


e* 


< e*- 


Uog(2/d)^/2 

V 


q{n,6/2)^ 


Cq{n,S/2) \ 

^ ) 


□ 


Theorem 3. If Assumptions 1, 2 and 3 are satisfied, then,
for any $\delta > 0$, with probability at least $1 - \delta$ over the draw
of a sample of size $n$, the following bound holds for all
$i \in [1, n]$:

\[ |\hat\beta(v_i) - \beta(v_i)| \le e^{C} \left[ \frac{\log(2/\delta)^{N/2}}{\sqrt{n}}\, q(n, \delta/2)^3 + \frac{C\, q(n, \delta/2)}{n^{3/2}} \right], \]

where $q(n, \delta) = \frac{2}{c} \log(nc/2\delta)$ with $c$ defined in Assumption 1,
and where $C$ is some universal constant.


Proof. The proof is a direct consequence of the previous
proposition and Proposition 15. □


G EMPIRICAL CONVERGENCE 

Here we present an example of convergence of the empirical
bidding functions to the true equilibrium bidding function,
even when not all technical assumptions are verified.
We sampled valuations from a log-normal distribution with
parameters $\mu = 0$ and $\sigma = 0.4$ and calculated the empirical
bidding function. Notice that in this case, the support
of the distribution is not bounded away from zero (see
Figure 7(a)). Figure 7(b) shows the true equilibrium bidding
function as well as the range of empirical equilibrium functions
(in dark grey) obtained after repeating this experiment
10 times. Finally, the region in light gray depicts the predicted
theoretical confidence bound in $O(1/\sqrt{n})$. Figure 7(c)
shows the rate of uniform convergence to the true equilibrium
function as a function of $n$.










